SoatDev IT Consulting
  • December 13, 2023

The diff utility compares files by lines, which is often what you’d like it to do. But sometimes you’d like more granularity.

For example, suppose we want to compare two versions of Psalm 23. Here are the first three verses in the King James version:

The Lord is my shepherd; I shall not want.
He maketh me to lie down in green pastures:
he leadeth me beside the still waters.
He restoreth my soul:
he leadeth me in the paths of righteousness
for his name’s sake.

And here are the corresponding lines from a more contemporary translation, the English Standard Version:

The Lord is my shepherd; I shall not want.
He makes me lie down in green pastures.
He leads me beside still waters.
He restores my soul.
He leads me in paths of righteousness
for his name’s sake.

Save these in two files, ps23.kjv and ps23.esv. If we run

diff ps23.kjv ps23.esv

we get

2,5c2,5
< He maketh me to lie down in green pastures:
< he leadeth me beside the still waters.
< He restoreth my soul:
< he leadeth me in the paths of righteousness
---
> He makes me lie down in green pastures.
> He leads me beside still waters.
> He restores my soul.
> He leads me in paths of righteousness

This says that the two files differ in lines 2 through 5; the first and last lines are identical. The output shows lines 2 through 5 from each file but doesn’t show how they differ.
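As a rough illustration of what a line-level comparison reports, here is a minimal Python sketch using the standard difflib module. It is not how diff itself is implemented; it treats each line as one indivisible unit and prints the ranges of lines that differ, converted to the 1-based inclusive numbering that diff uses.

```python
import difflib

kjv = """The Lord is my shepherd; I shall not want.
He maketh me to lie down in green pastures:
he leadeth me beside the still waters.
He restoreth my soul:
he leadeth me in the paths of righteousness
for his name's sake.""".splitlines()

esv = """The Lord is my shepherd; I shall not want.
He makes me lie down in green pastures.
He leads me beside still waters.
He restores my soul.
He leads me in paths of righteousness
for his name's sake.""".splitlines()

# Each opcode is (tag, i1, i2, j1, j2) with 0-based, half-open line ranges.
matcher = difflib.SequenceMatcher(None, kjv, esv, autojunk=False)
changed = [
    (i1 + 1, i2, j1 + 1, j2)  # 1-based inclusive ranges for a "replace" block
    for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    if tag != "equal"
]
print(changed)  # [(2, 5, 2, 5)], matching the 2,5c2,5 header
```

A whole line counts as changed even if only one character differs, which is why the output cannot point to maketh versus makes.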

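To see more fine-grained differences, such as changing maketh to makes, we can run the version of diff that comes with git. (Inside a git working tree, add the --no-index option so that git compares the two files with each other rather than against the index.)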

If we run

git diff --word-diff ps23.kjv ps23.esv

we can compare the files by words rather than by lines. This produces

diff --git a/ps23.kjv b/ps23.esv
index b90b858..be2a1a8 100644
--- a/ps23.kjv
+++ b/ps23.esv
@@ -1,6 +1,6 @@
The Lord is my shepherd; I shall not want.
He [-maketh-]{+makes+} me[-to-] lie down in green [-pastures:-]
[-he leadeth-]{+pastures.+}
{+He leads+} me beside[-the-] still waters.
He [-restoreth-]{+restores+} my [-soul:-]
[-he leadeth-]{+soul.+}
{+He leads+} me in[-the-] paths of righteousness
for his name's sake.

The colors help make the text more readable, assuming you can see the difference between red and green. The color scheme is configurable through git's color.diff settings. But the text is readable without the color highlighting. For example, in the first changed line we have

[-maketh-]{+makes+}

which means we remove the word maketh and add the word makes.
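To make the bracket notation concrete, here is a minimal Python sketch that compares a single pair of lines word by word and prints the result in the same [-removed-]{+added+} style. It uses difflib from the standard library rather than git's own algorithm, and the function name word_diff is made up for this illustration. Because it works on one line at a time, it will not reproduce git's output wherever git groups changes across line breaks.

```python
import difflib

def word_diff(old: str, new: str) -> str:
    """Word-level diff of two single lines, in [-old-]{+new+} notation."""
    a, b = old.split(), new.split()  # words = runs of non-whitespace
    out = []
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.append(" ".join(a[i1:i2]))
            continue
        piece = ""
        if i2 > i1:  # words present only in the old line
            piece += "[-" + " ".join(a[i1:i2]) + "-]"
        if j2 > j1:  # words present only in the new line
            piece += "{+" + " ".join(b[j1:j2]) + "+}"
        out.append(piece)
    return " ".join(out)

print(word_diff("He maketh me to lie down", "He makes me lie down"))
# He [-maketh-]{+makes+} me [-to-] lie down
```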

We can compare the files on an even finer level, comparing by characters rather than words. For example, rather than saying we need to change maketh to makes, the software can say we need to change the th ending to s. We can do this by running

git diff --word-diff-regex=. ps23.kjv ps23.esv

The option --word-diff-regex=. tells git to treat each match of the regular expression . as a word. Since the dot matches any single character, this chops the lines into individual characters.

diff --git a/ps23.kjv b/ps23.esv
index b90b858..be2a1a8 100644
--- a/ps23.kjv
+++ b/ps23.esv
@@ -1,6 +1,6 @@
The Lord is my shepherd; I shall not want.
He make[-th-]{+s+} me [-to -]lie down in green pastures[-:-]
[-h-]{+.+}
{+H+}e lead[-eth-]{+s+} me beside [-the -]still waters.
He restore[-th-]{+s+} my soul[-:-]
[-h-]{+.+}
{+H+}e lead[-eth-]{+s+} me in [-the -]paths of righteousness
for his name's sake.

As before we have square brackets to indicate what to remove and curly braces to indicate what to add, but now we’re removing and adding letters rather than words.
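The effect of the regular expression can be sketched in Python: every non-overlapping match becomes one token, so the pattern . yields one token per character. The helper regex_diff below is a hypothetical name, and it relies on difflib rather than git's diff machinery, so treat it as an approximation of the idea and not a reimplementation.

```python
import difflib
import re

def regex_diff(old: str, new: str, word_regex: str) -> str:
    """Diff two strings after splitting them into regex matches."""
    a = re.findall(word_regex, old)  # each match is one "word"
    b = re.findall(word_regex, new)
    parts = []
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        removed, added = "".join(a[i1:i2]), "".join(b[j1:j2])
        if tag == "equal":
            parts.append(removed)
            continue
        if removed:
            parts.append(f"[-{removed}-]")
        if added:
            parts.append(f"{{+{added}+}}")
    # Joining with "" suits the pattern "." where spaces are tokens too;
    # a pattern that skips whitespace would lose the spaces here.
    return "".join(parts)

print(re.findall(r".", "maketh"))  # ['m', 'a', 'k', 'e', 't', 'h']
print(regex_diff("He restoreth my soul", "He restores my soul", "."))
# He restore[-th-]{+s+} my soul
```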

We can get a more compact display of the differences if we rely on color alone, by adding the --word-diff=color option.

git diff --word-diff=color --word-diff-regex=. ps23.kjv ps23.esv

produces the following, shown here without the red and green highlighting.

diff --git a/ps23.kjv b/ps23.esv
index b90b858..be2a1a8 100644
--- a/ps23.kjv
+++ b/ps23.esv
@@ -1,6 +1,6 @@
The Lord is my shepherd; I shall not want.
He makeths me to lie down in green pastures:
h.
He leadeths me beside the still waters.
He restoreths my soul:
h.
He leadeths me in the paths of righteousness
for his name's sake.

Equivalently, we can combine the two options

--word-diff=color --word-diff-regex=.

into the one option

--color-words=.

that passes the word-defining regular expression as an argument to --color-words.

This may be the most convenient way to see the differences, provided you can distinguish the colors and don’t need to use the plain text programmatically. Without the colors, make[-th-]{+s+}, for example, becomes simply makeths and we can no longer be sure what changed.
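The loss of information is easy to demonstrate. The sketch below assumes the removed and added text are wrapped in the common ANSI escape sequences for red and green; the exact codes git emits depend on its configuration, so the constants here are illustrative. Once the escape sequences are stripped, the deletion and the insertion run together.

```python
import re

# Assumed ANSI codes: red for removed text, green for added text.
RED, GREEN, RESET = "\x1b[31m", "\x1b[32m", "\x1b[m"

colored = f"make{RED}th{RESET}{GREEN}s{RESET}"

# Strip SGR escape sequences such as ESC[31m and ESC[m.
ansi_sgr = re.compile(r"\x1b\[[0-9;]*m")
plain = ansi_sgr.sub("", colored)

print(plain)  # makeths
```

If the output is going to be parsed by another program, the bracketed plain-text format keeps the removed and added pieces distinguishable.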

The post Fine-grained file differences first appeared on John D. Cook.
