Objective Truth: Lines of Code Should Be 80-Characters Long (2024)

I think lines of code should be no more than 80 characters long. Just abouteveryone disagrees with me, which is how I know I’m right.

Kidding aside, this is my argument:

The optimal line length for reading is 45-75 characters.
We should optimize code for reading, not writing.
On average, code is offset by ~4 chars of whitespace.
Hence, we should have at most ~80 character line limits.

This is not just personal preference. It is based on experimentally-verifiedfeatures of our psychology, as well as usage patterns in 1.8 trillion lines ofopen source code. An 80 character line limit willproduce better software than a 100, 120, or greater character limit – softwarethat’s easier for engineers to read and quicker for them to understand.

The Argument Against 80 Character Lines

These days it’s an unpopular¹ view that lines of code should be at most 80characters long. Many see it as an outdated artifact of the days when code waswritten on punchcards. No less than the immortal Linus Torvalds holds thisview. As he gently puts it:

Excessive line breaks are BAD. [. . .] [M]any of us have long long [sic] sinceskipped the whole “80-column terminal” model. [. . .] Longer lines arefundamentally useful. My monitor is not only a lot wider than it is tall, myfonts are universally narrower than they are tall. Long lines are natural. [. ..] I do not care about somebody with a 80x25 terminal window getting linewrapping. [. . .] People with restrictive hardware shouldn’t make it moreinconvenient for people who have better resources.[source]

The prevailing argument for abandoning the historic 80-character limit seems toboil down to this:

The only reasons we had for 80-character limits no longer apply.
We have big monitors now, and long lines are useful on big monitors.
If there are no good reasons to have such a limit, and good reasons to not have it, then we should not have it.
Therefore, we should not have an 80-character limit.

Call this “Linus’ Argument”. It is avalid argument – if itspremises are true then its conclusion must be true as well – so it should betaken seriously. But are all of its premises true?

Against Linus’ Argument

No. The first premise is false. There are strong reasons to have 80 characters –or even fewer – as one’s line limit. And they are as strong today as they werein the days of punch cards.

Against the first premise there are decades of research on the psychology ofreading. And it directly contradicts what Torvalds says about long lines being“natural”. Thisarticle is a very good summary.

We know that people don’t read each individual word. Instead, they use theirfoveal (or central) vision to focus on a word, while using their peripheralvision to find the next spot on which to focus. [. . .] We also know that peopledon’t fixate on every word, but tend to skip words (their eyes take littleleaps, called “saccades”) and fill in the rest. [. . .] [W]e know that readersanticipate the next line while moving their eyes horizontally along a line; so,their eyes are drawn down the left edge of the text.[source]

These properties of our psychology make long lines particularly problematic. Ifwe read by focusing on every few words and inferring what’s in between, thereis a real cost to having long lines which pull our eyes away from the rest ofwhat we’re reading. And if we find the next spot on which to focus usingperipheral vision, we pay a price for long lines that put the next focus pointoutside of the periphery.

As the article continues:

“Reading a long line of type causes fatigue: the reader must move his head atthe end of each line and search for the beginning of the next line.” [. . .] Ifa casual reader gets tired of reading a long horizontal line, then they’re morelikely to skim the left edge of the text [potentially missing importantinformation at the end of the line]. If an engaged reader gets tired of readinga long horizontal line, then they’re more likely to accidentally read the sameline of text twice (a phenomenon known as “doubling”).[source]

In summary, long lines:

Cause greater fatigue
Result in more reading errors
Make it easier to miss important information
Make it harder to find desired information
Waste time (via "doubling")

The downsides of long lines were famously observed in a pair of eye-scanningexperiments done on Google’s search interface². In the experiments,people were given the Google search results displayed in either its 2005 or 2015interface and asked to find a specific piece of information.

Here are the resulting heatmaps showing where on the page people looked:

(Image Source: Google Golden Triangle Research: Eye Tracking Study; TheEvolution of Google’s Search Results Pages & Effects on User Behavior)

The people given the 2005 version of the interface initially wasted time readingthe entire lines at the top, then began quickly skimming the left hand column.In the 2015 version, with a shorter line length, people were able to morequickly assess whether or not a given result was what they were looking for andmove on if it wasn’t.

The eye tracking demonstrated that people scanned the entire page quickly downthe left side to examine Google’s updated categories. [When Google reduced theline length on their search results] [t]here was significantly less horizontalscanning compared to the 2005 study. In fact, the amount of time it tookparticipants to find a desirable result decreased from the 05’s study average14-seconds to the new study average 8-9 seconds.[source]

It took people fully ~40% less time to find what they were looking for withshorter lines. How short? The lines in Google’s 2015 interface wereapproximately 85 characters long.

I think it’s obvious that we want to optimize for each of (A)-(E) above when itcomes to software engineering. We want to make our code as easy as possible toread correctly. We want engineers to find what they’re looking for as quickly asthey can and to avoid unnecessary errors. In all of these respects, long linesare bad.

So it’s simply not true that the “only reasons we had for 80-character limits nolonger apply”. As such, the argument is valid but unsound, and its conclusiondoesn’t follow.

The Positive Case for 80-Chars

So far I’ve made a strictly negative case against “long” lines. But, as theprevious section might suggest, I think we can go further and actually makea positive argument for 80 character line limits. It goes like this:

The optimal line length for reading is 45-75 characters.
We should optimize code for reading, not writing.
On average, code is prefixed by ~4 chars of whitespace.
Hence, we should have at most ~80 character line limits.

If you have doubts about the first premise, spend 2 minutesgoogling “optimal linelength”. You will quickly find that in Typography – the domain thatactually studies this and performs eye-scanning experiments – the resoundingconsensus is 45-75 characters. For instance:

65 characters (2.5 times the Roman alphabet) is often referred to as theperfect measure. Derived from this number is the ideal range that alldesigners should strive for: 45 to 75 characters (including spaces andpunctuation) per line for print.[source]

Anything from 45 to 75 characters is widely regarded as a satisfactory lengthof line for a single-column page set in a serifed text face in a text size.The 66-character line (counting both letters and spaces) is widely regarded asideal. [source]

The length of a line should be comfortable to read: too short and it breaks upwords or phrases; too long and the reader must search for the beginning ofeach line. If you are uncertain about line length, a good rule of thumb is toset the type with 35 to 70 characters per line.[source]

The optimal line length for body text is 50–75 characters[source]

[I]t is widely accepted that line lengths fall between 45 and 75 charactersper line (cpl), though the ideal is 66 cpl (including letters and spaces).[source

This should speak for itself.

My reasons for the second premise are straightforward. Code is written once butread many, many times. We read code far more than we write it. In a long-livedcodebase, for example, code that is written over the course of a couple hoursmight be read-only for the next decade. And usually the people who read it arenot the people who write it – meaning we want to make it as easy as possible forpeople to discover (rather than remember) its meaning.

The Problem of Indentation

At this point there is an obvious objection: most code is prefixed byindentation. And in some languages (e.g. python) indentation actually has ameaning, so our line limits have to make room for it. Hence, even if it’spsychologically optimal to have short lines, we need to account for the factthat most code is not going to be able to start at the left edge of the screen.

This cannot be denied. But how much whitespace on average prefixes a line ofcode?

The average matters because we shouldn’t be purists when it comes to line limits. A line of codecan exceed the limit if there is good reason. What we want is just a rule thatis going to work in the vast majority of cases. With that in mind, it seems thata reasonable line limit would be equal to the high end of the optimal readingrange – roughly 75 characters – plus the average amount of indentation in thelanguage.

As it happens, it’s possible to get a reasonable approximation of the amount ofindentation in professional codebases for a given language. In 2016, Githubreleased a set of public BigQuery tables to allow people to easily write SQLagainst the code in public GitHub repositories.

I wrote aqueryagainst this data for the most popularlanguagesto find out how much indentation they have on average perline.³ Here are some of the results:

Language	Avg Line Length*	Avg Indent.	Indent. (50th%)	Line Length (50th%)	Line Length (99th%)
sql	54.8	1.2	0	45	204
powershell	41.8	4.5	4	36	156
python	40.5	6.4	4	37	106
scala	40.3	3.7	3	37	117
html	40.3	4.2	2	28	196
c#	38.6	8	8	32	139
java	37.9	4.5	4	35	117
r	36.7	3	1	31	129
swift	36.6	4.6	4	29	150
rust	35.5	5.5	4	32	97
ruby	35.3	4.9	4	30	115
c++	35.3	3.5	2	32	107
objective c	34.9	1.1	1	32	87
php	33.4	4.3	4	27	128
shell	32.4	1.5	0	26	120
go	31.4	1.2	1	25	114
javascript	30.8	4.8	4	25	109
c	29.1	1.4	1	26	78
css	24.7	1.6	1	20	107

* = includes indentation

The query excluded things like binaries, auto-generated files, build-artifacts,minimized files, data files, non-text files, empty lines, and anything thatdidn’t look like it was written, or intended to be consumed, by a human.

After all of these filters were applied, the query ran against 1.78 trillionlines of open source code in the 19 mostpopular languages, and found that there are on average only 4.1 spaces ofindentation per line and ~35 characters per line total.

Interestingly, in no case did the average amount of indentation for a languagerequire that that language have greater than 80 character line limits in orderto accommodate the 45-75 character ideal range prescribed by the psychology ofreading.

Put another way: indentation does not take up all that much space on most lines.We can use indentation and still comfortably remain within 80 charactersthe vast majority of the time.

The query did reveal, however, that an 80-character limit would be disruptive.Among the most popular languages, only one (C) had a p-99 line length that wasunder 80 characters. This means that in most languages, at least 1% of the linesin open source code are longer than I’m arguing they should be. If any of theseprojects were to adopt an 80-character line limit, >1% of the lines would haveto change.

This isn’t surprising. Again: programmers like to write long lines. It is thedefault preference in the age of big monitors. I never said that an 80 characterlimit wouldn’t have consequences. But I want to emphasize that thoseconsequences aren’t the end of the world. I have written code in 9 of the 19most popular languages professionally – more than that if you include hobbyprojects. In every single one of them without exception it has been possible towrite expressive, functional code in under 80 characters per line. Even SQL,which had far-and-away the highest p99 line length, can be comfortably writtenwithin these limits. As evidence, I submit the veryquery which produced thedata above.

Objections

Objection 1. Reading code is just not the same thing as reading prose in abook or even reading Google search results. It’s not a natural language. Sothe lessons learned from reading natural language do not apply here.

Fair. Code is obviously not prose. But there have also been eye scanning studiesdone on programmers4 (albeit not as many). And they validate what was said above– namely, programmers likewise jump from point to point when they read code,inferring what’s in between, and use their peripheral vision to guide them indetermining where to focus next.

This is a representativestudy, with full text availablehere (at the time I am writing). They include one figure of the IDE with a“representative [eye] scan-path superimposed”. Here it is:

Notice how the programmer’s eyes jumped around. When reading the code, theyfocused on discrete points in the line – often only one point per line: thecenter. This strongly suggests that inference and peripheral version were beingused to fill in the rest of the information: two processes that long linesdisrupt.

It’s not surprising that programmers would read code the same way people readeverything else. This is just the way our eyes and brains work. We have a small nervecluster in our retina that onlyallows us to focus narrowly. With this limitation, the most efficient way toread anything is to jump from point to point and fill in theblanks. It’s just a consequence of our biology.

Objection 2. When code is stretched out vertically it’s actually difficult toread/understand.

I admit it takes some getting used to if you’re accustomed to writing code thatcan be longer. But once you are used to it, it actually usually helps inunderstanding the code – especially when you’re reading code that you did notwrite (which, again, is what we should be optimizing for).

Just consider these examples. Which one is easier to understand?

LENGTH(REGEXP_REPLACE(REGEXP_REPLACE(c.content, r'\t', ' '), r'(\n|\r|\v){2,}', '\n')) as cleaned_content,

VS. the same code broken up so it’s under 80 characters per line:

LENGTH( REGEXP_REPLACE( REGEXP_REPLACE(c.content, r'\t', ' '), r'(\n|\r|\v){2,}', '\n' )) as cleaned_content,

I’ll bet the second one was simpler to understand.

What is an input of whatin the first one? You can probably figure it out, but it’s emphatically notobvious if you didn’t write it. The second one is much more forthcoming aboutit* meaning.

I think the problem here is that most engineers are optimizing for writing code.This is completely natural. When you’re writing the code, your intent isperfectly clear to you. But this just isn’t the case for anyone reading yourcode. We need to be thinking more about the experience of those people.

Objection 3. Vertical code just wastes space on my screen.

Then you don’t have enough panes open in your editor! Narrow code on a widemonitor is amazing. You can often have 3 or 4 vertical splits in your editor (inVim I sometimes have multiple splits for different parts of the same file),allowing you to quickly consult other parts of the codebase. It is incrediblyhelpful.

Conclusion

I think that the three premises of my argument are true, and that its conclusionfollows. We should have 80-char line limits even now, in this brave new world ofreadily available, wide screen monitors. Even if it is more fun or moreconvenient to write code with lines that are longer than 80 characters, thebenefits of shorter lines on reading far outweigh this.

Having such limits does not mean that they should never be exceeded. It justmeans that the burden of proof should be on anyone who wants to exceed them. Thelinter should yell and we should have to explain ourselves.

There certainly are instances in which exceeding an 80-character limit isjustified. Having 80 character limits just means taking seriously the factsabout our psychology and the languages we use, and then writing code in a waythat optimizes for them.

Footnotes

[1] Some good examples of the high-quality discussion this topic has seen:1,2, 3

[2]For more info, see: 1,2,3

[3]You can view and re-run my query here. If you don’t alreadyhave one, you will have to create a (free) Google Cloud project to do this.Running the query costs about $30, which can usually be donefor free with a trial account. If that’s too much work, I also put the code in apublic gist,and saved it to the repo for thisblog.

[4]Here are some eye tracking studies on programming if you’re interested inreading further:1,2,3