Recently I stumbled upon a simple yet powerful tool from Lido:
It's a CLI tool called Diffyscan. What is it for?
It lets you compare source code of verified smart contracts in Etherscan against a GitHub repository.
Is it a must-have for security researchers?
Let me show you how we used it in our latest security spotcheck of Lido V2.
We were starting a security spotcheck of the deployed version of Lido V2 on Ethereum mainnet.
On the first day, I needed to define the scope by collecting all relevant contracts and their addresses.
Luckily, Lido publishes all their deployed contracts with addresses in a single place. This was a great starting point.
I crawled all addresses, then opened each one in Etherscan. Soon I realized that the "Code" view in Etherscan wouldn't be a sensible choice. I wanted to read huge contracts, take down notes, etc. I was missing an IDE.
How could I move the verified source code of 10+ addresses from Etherscan to my local testing environment?
I was dealing with lots of addresses with large contracts, libraries, external dependencies, etc. Needless to say, manually copying them to local files was not a viable option. I still had one alternative though.
Use the source code in Lido's GitHub repository!
However, could I find a public version in the repository that matched the one deployed?
If I could, then it'd be just a matter of cloning the repo and checking out the commit that matched the deployed version. But...
Well, it wasn't easy. At first I wasn't able to find the exact commit documented anywhere. I also didn't know whether the tagged versions of the repo corresponded to deployments.
As I searched, I ran into some promising branches that looked like they could match. Which one should I choose?
It would've been a looong guessing game, had I not found Diffyscan.
It advertised exactly what I needed. And it was developed by Lido devs.
How could I not give it a try?
The chosen branch (v2-upgrade-cleanup) was just a first guess. Spoiler alert: it ended up being ok. We also tried it with v2.0.0-rc.2, and it was ok too.
As for the dependencies, I left the default ones. Since Diffyscan comes from Lido devs, it seems they listed the exact dependencies they need for their contracts.
Finally, I ran python3 main.py in the root dir of Diffyscan's repository.
Upon running, Diffyscan loaded the config, and started crawling the contracts.
For each address I set in the config it retrieved all files from Etherscan, tried to locate them in the repository, and diff'ed them.
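To make that loop concrete, here's a rough sketch of the compare step. It is not Diffyscan's actual code: `compare_files` is a hypothetical helper, and I'm assuming the verified sources and the repository files are already in memory as path-to-text mappings that share the same paths.

```python
import difflib


def compare_files(verified: dict[str, str], repo: dict[str, str]) -> list[dict]:
    """Compare each verified file against the repo file at the same path.

    Hypothetical helper for illustration only, not part of Diffyscan.
    """
    report = []
    for path, verified_src in sorted(verified.items()):
        repo_src = repo.get(path)
        if repo_src is None:
            # The file exists on Etherscan but not in the repository snapshot.
            report.append({"path": path, "found": False, "diffs": None})
            continue
        matcher = difflib.SequenceMatcher(
            None, repo_src.splitlines(), verified_src.splitlines(), autojunk=False
        )
        # Count every block of lines that is not identical on both sides.
        diffs = sum(1 for tag, *_ in matcher.get_opcodes() if tag != "equal")
        report.append({"path": path, "found": True, "diffs": diffs})
    return report
```

Each entry mirrors what I cared about in the logs: was the file found, and how many differences were there.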
In this case, it started with the contract at 0x17144556fd3424EDC8Fc8A4C940B2D04936d17eb, and retrieved 37 files that it would compare against the repository.
Once done, it logged whether the file was found or not, how many differences it spotted, and a local path to a pretty HTML digest with the actual diff.
As you can see below, it also works on the dependencies!
Then it compared the Lido-specific files.
Afterwards, it continued the process with the next address I had specified in the config.
And on and on went Diffyscan. Until it found a difference!
Diffyscan placed the HTML report in a local folder it had created, called digest. It included subfolders for each run and address.
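If you want a feel for how such a report can be produced, Python's standard library can already render a side-by-side HTML diff. The sketch below is my own approximation, not how Diffyscan builds its digest: `write_digest` and the output path layout are hypothetical.

```python
import difflib
from pathlib import Path


def write_digest(repo_src: str, verified_src: str, out_path: Path) -> Path:
    """Write a side-by-side HTML diff of two source texts to out_path.

    Hypothetical helper; the folder layout is whatever the caller passes in.
    """
    html = difflib.HtmlDiff(wrapcolumn=100).make_file(
        repo_src.splitlines(),
        verified_src.splitlines(),
        fromdesc="GitHub",
        todesc="Etherscan",
    )
    # Create the run/address subfolders on demand.
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(html, encoding="utf-8")
    return out_path
```

Opening the resulting file in a browser shows both versions in two columns, with changed lines highlighted.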
The HTML showed me the two files side-by-side, highlighting the exact place of the difference:
Well, false alarm! 😅
It seems the file just had some extra whitespace. For larger diffs I'd have seen bigger green and red portions of the files highlighted.
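This kind of false alarm is easy to triage automatically. Here's a minimal sketch, assuming both versions of the file are available as strings; `is_whitespace_only_diff` is a hypothetical helper, not something Diffyscan provides. It's a coarse heuristic that collapses runs of whitespace, so treat a positive result as a hint to skim the diff, not as proof that the files are equivalent.

```python
def is_whitespace_only_diff(a: str, b: str) -> bool:
    """Return True if a and b differ, but only in whitespace.

    Runs of spaces, tabs, and newlines are collapsed to a single space
    before comparing, so token boundaries are preserved.
    """
    if a == b:
        return False  # Identical texts have no diff at all.
    return " ".join(a.split()) == " ".join(b.split())
```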
Once the whole run was done, I could see similar reports for all scanned addresses:
There was also a logs.txt file including all logs, which could have been useful had I needed to move away from the terminal.
Diffyscan is a great starting point. It's simple, fast, and most of all, gets the job done. It does require some luck though.
If you don't know which version of the GitHub repository you should use, and cannot at least narrow down your options, then it won't help much. You'd have to iterate over every possible version of the code, and compare each one to the verified source code.
On top of that, the config requires you to list dependencies. Same problem as before. If you don't know the dependencies used, or their exact versions, you're in trouble.
I think Diffyscan works great in at least two scenarios.
First, if you know both the addresses and the exact version in GitHub, it helps you verify that the code matches. This is useful to verify claims of developers, and quickly spot potential differences, if any.
And second, if you have narrowed down versions to just a few, but are still unsure about which one matches, you can run the tool a few times and find the right one in a couple of minutes.
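That second scenario boils down to scoring a handful of candidates and picking the one with no mismatches. A minimal sketch of the idea, assuming you've already loaded the verified files and one snapshot of the repository per candidate ref into dictionaries (the function names and ref names here are hypothetical, and this is not how Diffyscan itself is driven):

```python
def mismatches(verified: dict[str, str], repo: dict[str, str]) -> int:
    """Number of verified files that are missing from, or differ in, the snapshot."""
    return sum(1 for path, src in verified.items() if repo.get(path) != src)


def rank_refs(
    verified: dict[str, str], snapshots: dict[str, dict[str, str]]
) -> list[tuple[str, int]]:
    """Rank candidate refs from best match (fewest mismatches) to worst."""
    scores = [(ref, mismatches(verified, files)) for ref, files in snapshots.items()]
    return sorted(scores, key=lambda item: item[1])


# Usage with made-up data: "ref-b" matches the verified sources exactly.
verified = {"A.sol": "a", "B.sol": "b"}
snapshots = {
    "ref-a": {"A.sol": "a", "B.sol": "old"},
    "ref-b": {"A.sol": "a", "B.sol": "b"},
}
ranking = rank_refs(verified, snapshots)
```

A ref with a score of zero is the one to check out locally.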
Have you used Diffyscan? What are your impressions? Otherwise, if you have an alternative to diff smart contracts against mainnet, mind sharing?