The Open Source Fallacy
Recently I have seen chats about the search of Twitter's offices in France (no, I won't call it by the other "name").
And while I won't go into detail about why this was a justified move, people have been joking that "if they want to know how the algorithm works, they just have to check the source on GitHub!", not realizing that open source doesn't mean 100% transparency.
What is open source
In layman's terms, open source means that the code of a website, piece of software or similar is made available to the public. How this is done depends on whoever owns the code.
Many share their code publicly through so-called "code forges", public sites that host the code and often also provide basic Git-compatible functionality such as committing and pushing code, pull requests, etc.
Famous examples of such services are GitHub, GitLab, Codeberg, Bitbucket and many more.
However, not every project shares its code publicly for everyone to view. Open source only demands that the code is provided, and under some licenses not even for free. This means a project can keep its source hidden and only provide it on request (and/or after a payment).
In the end, it depends heavily on the open source license used and the individuals behind the software.
Open Source ≠ Transparency
One core issue I often see is that people believe the software they run is the same as the code that is shared.
This couldn't be further from the truth.
What is shared in public and what you get to run are not necessarily the same.
As an example, I can share the code of some Java program that, when executed, prints random compliments to the console. However, the actual jar file I share instead gives you random insults.
The software I shared with you is not automatically built from that source. Modifications are always a possibility. This is one of the ways malware ends up being distributed.
Another, much better example is software running on a site or service you don't have any real control over.
In such a situation, the source can make everything look good while the software on the backend runs something different. This is where Twitter comes into play again. They can share the source code of their algorithms, but that doesn't change the fact that the server running the actual software may use something completely different.
Validation
Of course, one way to avoid being misled is to validate the software against the source, but this comes with its own problems.
Probably the most reliable way is to compare the file hash of the software you compiled from source with the hash of the downloaded one. If they are the same, it can be assumed that the files are the same. If they differ, the download was built from something else or was tampered with, although a build that is not reproducible (for example one that embeds timestamps) can also produce a different hash.
What matters here is that the source needs to be from the same version as the download, as any change made since that version was published would influence the hash.
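To make the hash comparison a bit more concrete, here is a minimal sketch in Python. The function names are my own placeholders, and SHA-256 is just my pick of a hash, not something any particular project prescribes. It also assumes the build is reproducible, meaning the same source always produces a byte-identical file.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large files do not have to fit into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_artifact(built: Path, downloaded: Path) -> bool:
    """True if the self-built file and the downloaded file hash identically."""
    return sha256_of_file(built) == sha256_of_file(downloaded)
```

You would call it with something like `same_artifact(Path("my-build/app.jar"), Path("downloads/app.jar"))`, where both paths are made up for this example. A result of `False` only tells you the two files differ, not why they differ.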
Unfortunately, this approach does not work for externally run software, and there are few to no methods of validating that what is running is the same as the source.
Conclusion
A project being open source is great! It means you can check the source for what it does and - depending on the licenses applied - make your own modifications to it.
But it does not mean that the software you download is exactly the same as that source.
The only true way to remain safe and be sure you have the same software as the source is to build it from the source yourself. Only then can you be sure that what is in the source is also in the software.
For externally run software this is a lot more problematic: unless the system allows you to run your own instance interchangeably (see for example Mastodon, or hosting your own CI or Git forge), you are required to use the hosted one and hope that it is what they promise you the source shows.
And that's why I don't think that what Twitter shares as the source of their algorithm is necessarily the algorithm actually in use.
Stay safe everyone!