Monday, 21 April 2014

How has an increase in system complexity affected new programmers?



This Q&A is part of a weekly series of posts highlighting common questions encountered by technophiles and answered by users at Stack Exchange, a free, community-powered network of 100+ Q&A sites.

Adam asked:


As a "new" programmer (I first wrote a line of code in 2009), I've noticed it's relatively easy to create a program that exhibits quite complex elements today with things like .NET framework, for example. Creating a visual interface or sorting a list can be done with very few commands now.

When I was learning to program, I was also learning computing theory in parallel. Things like sorting algorithms, principles of how hardware operates together, Boolean algebra, and finite-state machines. But I noticed if I ever wanted to test out some very basic principle I'd learned in theory, it was always a lot more difficult to get started because so much technology is obscured by things like libraries, frameworks, and the OS.

Making a memory-efficient program was required 40 or 50 years ago because memory was scarce and expensive, so most programmers paid close attention to data types and how the instructions would be handled by the processor. Nowadays, some might argue that due to increased processing power and available memory, those concerns aren't a priority.

My question is if older programmers see innovations like these as a godsend or an additional layer to abstract through, and why might they think so? And do younger programmers benefit more in learning low-level programming BEFORE exploring the realms of expansive libraries? If so, then why?

See the full, original question here.

Learning should be easy and get harder


Eric Lippert answers (150 votes):


Cheap memory, enormous disks, and fast processors aren't the only things that have freed people from the need to obsess over every byte and cycle. Compilers are now far, far better than humans at producing highly optimized code when it matters.

Moreover, let's not forget what we're actually trying to optimize for, which is value produced for a given cost. Programmers are way more expensive than machines. Anything we do that makes programmers produce working, correct, robust, fully-featured programs faster and cheaper leads to the creation of more value in the world.

"My question though is how do people feel about this 'hiding' of lower-level elements. Do you older programmers see it as a godsend or an unnecessary layer to get through?"

It is absolutely necessary to get any work done. I write code analyzers for a living; if I had to worry about register allocation or processor scheduling or any of those millions of other details then I would not be spending my time fixing bugs, reviewing performance reports, adding features, and so on.

All of programming is about abstracting away the layer below you in order to make a more valuable layer on top of it. If you do a "layer cake diagram" showing all the subsystems and how they are built on each other you'll find that there are literally dozens of layers between the hardware and the user experience. I think in the Windows layer cake diagram there's something like 60 levels of necessary subsystems between the raw hardware and the ability to execute "hello world" in C#.

"Do you think younger programmers would benefit more learning low-level programming BEFORE exploring the realms of expansive libraries?"

You put emphasis on BEFORE, so I must answer your question in the negative. I'm helping a 12-year-old friend learn to program right now and you'd better believe I'm starting them in Processing.js and not x86 assembler. If you start a young programmer in something like Processing.js they'll be writing their own shoot-em-up games in about eight hours. If you start them in assembler they'll be multiplying three numbers together in about eight hours. Which do you think is more likely to engage the interest of a younger programmer?

Now if the question is "do programmers who understand layer n of the cake benefit from understanding layer n - 1?" the answer is yes, but that's independent of age or experience; it's always the case that you can improve your higher level programming by understanding better the underlying abstractions.

Related: "What are the advantages of developing with outdated technologies?"

Beware bloated data structure


Mike Dunlavey answers (44 votes):


I had ideas on this subject, and I put them into a book 20 years ago.

It's long out of print, but you can still get used copies on Amazon.

One simple answer to your question is as old as Aristotle: Nature abhors a vacuum.

As much as machines have gotten faster and bigger, software has gotten slower and bigger.

To be more constructive, what I proposed was that information theory, and its direct relevance to software, be part of computer science education. It is only taught now, if at all, in a very tangential way.

For example, the big-O behavior of algorithms can be very neatly and intuitively understood if you think of a program as a Shannon-type information channel, with input symbols, output symbols, noise, redundancy, and bandwidth.
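One way to read that claim, offered here as an illustration rather than as the book's own example: locating an item among n + 1 possible positions takes about log2(n + 1) bits of information, and a yes/no comparison can supply at most one bit. The sketch below counts the comparisons a binary search makes and checks the worst case against that bound. The function name probes_to_locate and the test data are made up for the illustration.

```python
import math

def probes_to_locate(sorted_items, target):
    """Binary search returning (insertion_index, number_of_comparisons)."""
    lo, hi = 0, len(sorted_items)
    probes = 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1  # one yes/no question, so at most one bit learned
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return lo, probes

n = 1000
items = list(range(0, 2 * n, 2))  # 0, 2, 4, ..., 1998

# There are n + 1 possible answers (insertion points 0..n),
# so at least log2(n + 1) bits are needed to pick one of them.
bits_needed = math.ceil(math.log2(n + 1))

# Try every target that can land on each insertion point.
worst_case = max(probes_to_locate(items, t)[1] for t in range(-1, 2 * n + 1))

print(bits_needed, worst_case)
```

Seen this way, O(log n) is the cost of an algorithm that extracts close to a full bit from every question it asks, while a linear scan asks questions whose answers are almost always predictable and so carry very little information.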

On the other hand, the productivity of a programmer can be understood in similar terms using Kolmogorov information theory. The input is a symbolic conceptual structure in your head, and the output is the program text that comes out through your fingertips. The programming process is the channel between the two. When noise enters the process, it creates inconsistent programs (bugs). If the output program text has sufficient redundancy, it can permit the bugs to be caught and corrected (error detection and correction). However, if it is too redundant, it is too large, and its very size, combined with the error rate, causes the introduction of bugs.

As a result of this reasoning, I spent a good part of the book showing how to treat programming as a process of language design, with the goal of being able to define the domain-specific-languages appropriate for a need. We do pay lip service to domain-specific-languages in CS education but, again, it is tangential.

Building languages is easy. Every time you define a function, class, or variable, you are adding vocabulary to the language you started with, creating a new language with which to work. What is not generally appreciated is that the goal should be to make the new language a closer match to the conceptual structure of the problem. If this is done, then it has the effect of shortening the code and making it less buggy simply because, ideally, there is a 1-1 mapping between concepts and code.

If the mapping is 1-1, you might make a mistake and code a concept incorrectly as a different concept, but the program will never crash, which is what happens when it encodes no consistent requirement.
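A minimal sketch of what that can look like, assuming three invented requirements: orders of 100.00 or more ship free, other orders pay 5.00 shipping, and members get 10% off. Every name in it (Order, qualifies_for_free_shipping, member_discount, amount_due) is hypothetical. Each requirement is coded in exactly one place, so changing a requirement means changing one definition.

```python
from dataclasses import dataclass

@dataclass
class Order:
    total_cents: int   # money kept in integer cents to avoid float rounding
    is_member: bool

FREE_SHIPPING_THRESHOLD_CENTS = 10_000
FLAT_SHIPPING_CENTS = 500

# New vocabulary: one name per concept in the requirements.
def qualifies_for_free_shipping(order: Order) -> bool:
    return order.total_cents >= FREE_SHIPPING_THRESHOLD_CENTS

def member_discount(order: Order) -> int:
    return order.total_cents // 10 if order.is_member else 0

def amount_due(order: Order) -> int:
    shipping = 0 if qualifies_for_free_shipping(order) else FLAT_SHIPPING_CENTS
    return order.total_cents - member_discount(order) + shipping

print(amount_due(Order(total_cents=12_000, is_member=True)))
print(amount_due(Order(total_cents=5_000, is_member=False)))
```

A mistake here would most likely be a wrong rule, such as the wrong threshold, rather than a program that corresponds to no requirement at all.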

We are not getting this. For all our brave talk about software system design, the ratio of code to requirements is getting bigger, much bigger.

It's true, we have very useful libraries. However, I think we should be very circumspect about abstraction. We should not assume that because B builds on A and that is good, C building on B is even better. I call it the "princess and the pea" phenomenon. Piling layers on top of something troublesome does not necessarily fix it.

To terminate a long post, I've developed a style of programming (which sometimes gets me in trouble) where:

- Invention is not a bad thing. It is a good thing, as it is in other branches of engineering. Sure it may be creating a learning curve for others, but if the overall result is better productivity, it is worthwhile.
- Haiku-style minimalist code is valued. That especially goes for data structure design. In my experience, the biggest problem in software these days is bloated data structure.

All hail abstractions


Robert Harvey answers (31 votes):


High-level abstraction is essential to achieving ongoing progress in computing.

Why? Because humans can only hold so much knowledge in their heads at any given moment. Modern, large scale systems are only possible today because you can leverage such abstractions. Without those abstractions, software systems would simply collapse under their own weight.

Every time you write a method, you're creating an abstraction. You're creating a bit of functionality that's hidden behind a method call. Why do you write them? Because you can test the method, prove it works, and then invoke that functionality any time you want just by making the method call, and you don't have to think anymore about the code that's inside that method.
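As a small illustration (the median function and the sample data are invented for this sketch), the sorting and index arithmetic below are checked once and then disappear behind a single name:

```python
def median(values):
    """Return the median of a non-empty sequence of numbers."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

# Test the method once...
assert median([3, 1, 2]) == 2
assert median([4, 1, 3, 2]) == 2.5

# ...then call it without thinking about what is inside.
response_times_ms = [120, 95, 310, 101, 99]
typical = median(response_times_ms)
print(typical)
```

The caller only needs to know what median means, not how it is computed.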

In the early days of computing, we used machine language. We wrote very small, bare metal programs with intimate knowledge of the hardware we were writing them for. It was a painstaking process. There were no debuggers; your program usually either worked, or it crashed. There was no GUI; everything was either command-line or batch process. The code you wrote would only work on that particular machine; it would not work on a machine with a different processor or operating system.

So we wrote high-level languages to abstract all of that detail away. We created virtual machines so that our programs could be portable to other machines. We created garbage collection so that programmers wouldn't have to be so diligent about managing memory, which eliminated a whole class of difficult bugs. We added bounds checking to our languages so that hackers couldn't exploit them with buffer overruns. We invented Functional Programming so that we could reason about our programs in a different way, and we rediscovered it recently to take better advantage of concurrency.

Does all this abstraction insulate you from the hardware? Sure it does. Does living in a house instead of pitching a tent insulate you from nature? Absolutely. But everyone knows why they live in a house instead of a tent, and building a house is a completely different ball game than pitching a tent.

Yet, you can still pitch a tent when it is necessary to do that, and in programming, you can (if you're so inclined) still drop down to a level closer to the hardware to get performance or memory benefits that you might not otherwise achieve in your high-level language.

Can you abstract too much? "Overtake the plumbing," as Scotty would say? Of course you can. Writing good APIs is hard. Writing good APIs that correctly and comprehensively embody the problem domain, in a way that is intuitive and discoverable, is even harder. Piling on new layers of software isn't always the best solution. Software Design Patterns have, to some degree, made this situation worse, because inexperienced developers sometimes reach for them when a sharper, leaner tool is more appropriate.

A broken dream


david.pfx answers (4 votes):


The increase in the complexity of systems is relentless, oppressive, and ultimately crippling. For me as an older generation programmer, it is also bitterly disappointing.

I've been programming for well over 40 years, having written code in 50-100 different languages or dialects, and become expert in 5-10. The reason I can claim so many is that mostly they're just the same language, with tweaks. The tweaks add complexity, making every language just a little different.

I have implemented the same algorithms innumerable times: collections, conversions, sort and search, encode/decode, format/parse, buffers and strings, arithmetic, memory, I/O. Every new implementation adds complexity, because every one is just a little different.

I wonder at the magic wrought by the high flying trapeze artists of the Web frameworks and mobile apps, at how they can produce something so beautiful in such a short time. Then I realize how much they don't know, how much they will need to learn about data or communications or testing or threads or whatever before what they do becomes useful.

I learned my craft in the era of fourth generation languages, where we genuinely believed that we would produce a succession of higher and higher level languages to progressively capture more and more of the repetitive parts of writing software. So how did that turn out, exactly?

Microsoft and IBM killed that idea by returning to C for writing apps for Windows and OS/2, while dBase/FoxPro and even Delphi languished. Then the Web did it again with its ultimate trio of assembly languages: HTML, CSS, and JavaScript/DOM. It's been all downhill from there. Always more languages and more libraries and more frameworks and more complexity.

We know we should be doing it differently. We know about CoffeeScript and Dart, about Less and Sass, about templates to avoid having to write HTML. We know and we do it anyway. We have our frameworks, full of leaky abstractions, and we see what wonders can be done by those chosen few who learn the arcane incantations, but we and our programs are trapped by the decisions made in the past. It's too complicated to change or start over.

The result is that things that ought to be easy are not easy, and things that ought to be possible are nearly impossible, because of complexity. I can estimate the cost of making changes to implement a new feature in an established code base and be confident I'll be about right. I can estimate, but I can't justify it or explain it. It's too complicated.

In answer to your final question, I would strongly advise younger programmers to start as high on the layer cake as they possibly can, and only dive down to the lower layers as need and desire provide the impetus. My preference is for languages with no loops, little or no branching, and explicit state. Lisp and Haskell come to mind. In practice I always finish up with C#/Java, Ruby, JavaScript, Python, and SQL because that's where the communities are.

Final words: complexity is the ultimate enemy! Beat that and life becomes simple.
