Discussions about open source for clinical programming are everywhere these days, and at industry conferences there are almost as many R presentations as SAS® presentations. It’s a very exciting time, but also a little daunting as companies tackle the need to upskill teams of tenured SAS programmers to R.

The next few years are going to be full of learning as open source finds a prominent place in clinical programming, but we believe we have found a successful approach to taking experts in SAS and training them to use R for clinical trials. In this blog, we’re going to discuss the best methods we’ve found to efficiently upskill teams of tenured SAS clinical programmers to R.


First, let’s talk about training content. It’s not difficult to find resources that teach how to program in R, but R can do so much that those resources can be overwhelming. Plus, many of them are irrelevant to what a clinical programmer needs to know for their day-to-day job. R is used by data practitioners around the world in many fields, so most of the existing material is hard to sift through, and a learner has little idea where to start or how to apply what they find to their daily tasks.


To effectively upskill learners who are already experts in SAS, it’s important to leverage all the skills they already have and to give them targeted material to focus on.  Remember, we’re talking about SAS programming experts.  We need to use the fact that they know SAS to jumpstart their R journey, but how exactly can we do that? 

You can leverage their existing SAS knowledge with side-by-side comparisons of SAS and R code.    

[Image: Leveraging SAS Knowledge]

Learners can see code they’re familiar with, in this case, PROC SORT, and the corresponding R code that would get those same results, in this case, the arrange() function. 
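As a rough illustration of that kind of side-by-side comparison, here is a minimal sketch. It is not the original slide: the `dm` data frame and its values are made up for this example, and the SAS step is shown as a comment above the R code that would produce the same result with `arrange()` from the dplyr package.

```r
library(dplyr)

# Hypothetical demographics data, for illustration only
dm <- data.frame(
  USUBJID = c("STUDY01-003", "STUDY01-001", "STUDY01-002"),
  AGE     = c(54, 61, 47)
)

# SAS:
#   proc sort data=dm out=dm_sorted;
#     by usubjid;
#   run;

# R: arrange() sorts the rows by the listed variables
dm_sorted <- dm %>%
  arrange(USUBJID)
```

A SAS programmer can read the R line almost directly: the dataset goes in, the BY variable becomes the argument to `arrange()`, and the result is assigned to a new object in place of `out=`.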

[Image: SAS code]

Code can be expanded, in this case adding a descending sort, and you can see the additional code to replicate that added concept in R.   
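A sketch of what that expansion might look like, again with invented data and variable values: the SAS `descending` keyword maps to wrapping the variable in dplyr’s `desc()`.

```r
library(dplyr)

# Hypothetical data, for illustration only
dm <- data.frame(
  USUBJID = c("001", "002", "003", "004"),
  ARM     = c("Placebo", "Drug A", "Placebo", "Drug A"),
  AGE     = c(61, 47, 54, 58)
)

# SAS:
#   proc sort data=dm out=dm_sorted;
#     by arm descending age;
#   run;

# R: sort by ARM ascending, then by AGE descending within each ARM
dm_sorted <- dm %>%
  arrange(ARM, desc(AGE))
```

Only one piece changes from the simple sort, which is the point of building up examples this way: each new SAS concept has one matching addition in R.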

Using side-by-side SAS and R code ensures the content is familiar.  The hundreds of R functions don’t feel as daunting when learners can see them in a way that matches them up with the SAS code they know and love.  This takes away the huge task of getting people to understand R because it’s a translation of something they know, not an introduction to something completely different. 


Once SAS skills are translated to R, learners have to stop thinking like SAS programmers. They should be taught to embrace base R as well as the many packages of the Tidyverse and Pharmaverse. Translating their SAS skills already introduces many of these packages, but you can immerse them in the rest of the R world by leveraging their clinical programming expertise. To do this, you can use CDISC datasets and examples throughout the training.

[Image: Leveraging Clinical Programming Knowledge]

In this example, a clinical programmer will be familiar with the USUBJID and ARM variables and with deriving an intent-to-treat flag. So before the R code is even explained to them, they can likely use context clues to guess what it’s doing.
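To make this concrete, here is a hedged sketch of such an example. The derivation rule is an assumption made for illustration (a subject is flagged "Y" when ARM is populated and is not "Screen Failure"); a real study would follow its own analysis plan. The data values are invented, and `ITTFL` follows the usual ADaM naming for this flag.

```r
library(dplyr)

# Hypothetical DM-like data, for illustration only
dm <- data.frame(
  USUBJID = c("STUDY01-001", "STUDY01-002", "STUDY01-003", "STUDY01-004"),
  ARM     = c("Placebo", "Drug A", "Screen Failure", NA)
)

# SAS:
#   data adsl;
#     set dm;
#     if not missing(arm) and arm ne "Screen Failure" then ittfl = "Y";
#     else ittfl = "N";
#   run;

# R: mutate() adds the new variable; if_else() plays the role of IF/ELSE.
# The rule itself is an assumed example, not a standard derivation.
adsl <- dm %>%
  mutate(ITTFL = if_else(!is.na(ARM) & ARM != "Screen Failure", "Y", "N"))
```

Even without knowing `mutate()` or `if_else()`, someone who derives population flags every day can guess what the last two lines do from the variable names alone.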

So many R books show examples with random data, but who wants to learn R using a data set with the make and model of a car? By using CDISC datasets, learners see examples that use a language they’re familiar with.  This adds familiarity to their new skill so it’s not so overwhelming. 


Another key piece of training is showing the learner that the training content is relevant. Using CDISC datasets and examples helps learners see how things fit into their day-to-day programming, but training should also cover task-specific things such as creating datasets and displays. By the end of the training, a learner should understand how to replicate some of the most common clinical programming tasks, such as programming SDTMs, ADaMs, tables, and figures. This makes what they’re learning feel practical and immediately applicable, because they’ve seen how to use R to complete the tasks they do every day.


So far we’ve talked about how to ensure success in the short term, but what about the long term? Recall that there’s a ton of R material out there, so how can we use that to our advantage? A crucial part of training content is teaching learners how to use the resources and documentation they can find outside of the training. Once they’ve got some training under their belt and have learned how to use R learning materials, the amount of information out there is no longer overwhelming; it’s exciting. The opportunity to learn is endless, and they walk away not only knowing R but also knowing how to use R resources and documentation to keep training themselves as the pharma open-source landscape rapidly evolves.


Next, we’ll discuss the training format. Let’s face it, SAS clinical programmers are busy. And now we’re going to ask them to learn on top of everything they already have going on for projects? The challenge here is to find a format that keeps training interesting and engaging and that fits into an already packed workday. Let’s discuss how you can do that.


With any training, there is often a discussion of individual self-paced eLearning versus live group sessions. Everyone learns in different ways, and both have pros and cons. In our experience, people do not make the time when they’re asked to learn on their own: it’s too easy for other obligations to take precedence, and learning gets put on the back burner. Because of this, we recommend live group sessions, which bring people together and keep training a priority. These days, live can mean in person or virtual, and the choice often depends on the learners’ locations. If all learners are in one place, you can maximize focus with in-person sessions, but in the remote landscape many companies work in, live virtual sessions certainly work and are often the best choice logistically.

Keep in mind, not everyone thrives in a live group environment. To account for the people who may prefer individual self-paced learning, it’s beneficial to also provide reference materials and eLearnings that learners can optionally use to complement the live group sessions. Think of it this way: they attend the live session for an introduction to the material, and they have handouts and self-paced videos they can refer to weeks or months later to remind them of everything they learned.


Once you’ve determined your overall format for training, it’s time to think about how to convey the material in a session. Learners need an opportunity to practice to truly understand concepts. 

To maximize what a learner can get out of a live session, it’s best to use a combination of lecture and guided practice. The lecture introduces the topics, shows examples, and provides a way for learners to ask questions. But most learners don’t fully understand a concept until they can practice, which is where guided practice comes in. In guided practice, learners are given exercises to practice the concepts directly after learning them.  

Let’s explore guided practice a little more. Maybe you’re thinking, “My company doesn’t yet have a shared R environment, so we couldn’t do that.” Or maybe your company does have a shared environment, but you have no clue how to use it for effective training. One option is Posit Cloud, a web service that delivers a browser-based experience similar to RStudio. Using Posit Cloud allows trainers to install any necessary packages and load some starting files, such as data and helper code, so all learners start from the same setup and files. The key benefit here is that learners can focus on the material, not the environment itself.


Finally, with regard to format, you may be wondering what the time commitment is. Ultimately, that depends on the group of learners and their project workload.

We’ve found the most success with weekly sessions of 1.5 to 2 hours. Sessions of that length are long enough to introduce and practice a helpful amount of content, but short enough that learners can stay focused and are not losing their entire day to training. Having a session each week is frequent enough to keep the material fresh in a learner’s mind, but it sets a pace where someone who misses a session, or who wants to do some additional learning after one, has time to do so before the next session.


Now, let’s talk about training candidates.  Choosing candidates from your team is a significant decision.  Learning a new skill is a big change, and it’s important to select people who are excited about the opportunity.   


When selecting candidates, it’s important to find the early adopters. These are the people in the group who make the most sense to train first, and they will become the champions for the success of the training. Candidates must be eager to learn, and they must also be given the time to take on the training commitment.

With the candidates, the aim should be to create a supportive environment where the learners can learn together and provide feedback to improve the training program.  And ultimately it’s about ensuring you have early adopters of the R training who can be the catalysts for driving change within the team and creating a culture of continuous learning.  


Now, once you have these candidates, how do you hold them accountable so they get the most out of the training? Attending classes is not enough, and the new skills acquired during training sessions can easily be lost without continuous practice. To address this, it’s helpful to establish some guardrails to ensure accountability for learning. For example, we’d recommend that each group has a leader who ensures the learners attend, stay engaged, and work to complete all assignments. It’s also really important that the motivation to learn R comes from department and company leaders. An expectation should be set that the team will become a multilingual place with all the skills needed to function effectively within the department. Beyond that, it’s important to think about how these new skills can be put to work on projects so the learners continue to use them as part of their job.


There are many effective ways to train. However, when you need to train people who have done the same thing for years to do something entirely different, some approaches make the process more effective than others.

When training SAS clinical programmers in R, the best training programs will leverage their expertise. You have to provide a translation of skills rather than an introduction to something completely different. And to make the training click, it has to be relevant. You can’t make busy learners wade through a lot of generic material to get to the applicable things.

Ultimately training has to show learners how they can use R to do the things they do daily, but also give them the tools to take R beyond that. R is not replacing SAS, but R adds to a clinical programmer’s tool kit to maximize efficiency and allow them to pick the best solution for the problem.  
