Home / Posts / 2014 February 24th

S and R Development


Response to the reading: "From S to R: 25 Years of AT&T Leadership in Statistical Computing" (Staff et al, 2013)

History of S and R programming languages

As a rule, when a programmer needs something, he/she programs it. Creating a form of artificial life that requires you and your computer, and occasionally lots of coffee. Sometimes these programs can be almost beautifully simple, and (user) friendly. Others look more like monsters, ripping us apart until we finally delete or abandon them in frustration.

What causes some software to live, multiplying like rabbits, and some to die quietly in a corner on the creators file system? An example of a successful case can be found in the article “From S to R: 35 Years of AT&T Leadership in Statistical Computing.” The writers focus on data analysis but the principles should be taught in any software development course.

3 Ideas

One, software must be beautifully useful. I say “beautifully useful” because there are plenty of programs that are “useful” but so frustratingly difficult to use that few would-be-users if any get over the initial learning barrier. During the development of S in the 1970’s, Becker and Chambers put some thought into designing a programming language and environment that was statistician friendly during a time of engineering-centric programming and data-type dependence. As a result, S was based entirely on functions. These functions were applied to vectors of data, regardless of data type. S functions were run line by line (similar to terminal) where results and intermittent results were immediately available and debuggable. What was once heavy duty and slow became incremental and fast. The heavy duty tools of data analysis became toys in a playground of data exploration.

Two, software must be easily distributed. In some ways, I think the initial developers of S were lucky. They were able to catch the Unix wave just when it was freely and therefore widely available. Anyone who downloaded Unix could start learning S. R showed up when S was no longer free.

Three, software must be extendible. In my opinion, a component of the success of Amazon and YouTube are because of their user-generated content. For each product on Amazon or each video on YouTube, comments create user buy-in. The platform is an interactive experience where the users create content and develop into a loyal community. In the same way, software also creates a type of community. Instead of content, users are generating packages or extensions. There are online forums to post bug reports or suggestions for new features. There are new packages to download, try out, and critique. The foundation of the software (core) must be strong enough to withstand the edits of the masses each developing and customizing according to their own needs. This endless customization keeps the tool relevant to today’s problems. The software grows like some strange ameba, stretching out arms in different directions, engulfing certain functions and expelling others. A type of evolutionary growth as what is useful is kept and what is not withers away.

These three points create a piece of software that is self sustaining. A well thought out design sets the stage/environment for the growth of a community of users invested in a tool to continue development, continue improvement and eventually create results never before imagined.

© Jennifer Chang 2020 · Last Update: 2020-10-15