
Researcher says deleted GitHub data can be accessed 'forever'

Truffle Security researcher Joe Leon warned GitHub users that deleted repository data is never actually deleted, which creates an "enormous attack vector" for threat actors.

Truffle Security warned that anyone can access repository and fork data on GitHub even after it's deleted, a feature that GitHub confirmed was normal for the platform.

In a blog post published on Wednesday, Joe Leon, security researcher at Truffle, detailed how deleted and private repository data stored on GitHub can be accessed by anyone. More alarmingly, Leon stated that GitHub intentionally designed the platform to behave this way.

Leon demonstrated how he was able to fork a repository, commit data to it, delete the fork and then access the so-called deleted commit data through the original repository in less than one minute. This could pose a threat, especially if GitHub users are unaware that such data can still be accessed.
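The check Leon describes can be sketched in a few lines of Python. This is a minimal illustration, not Truffle's tooling: it assumes the behavior he reports also applies to GitHub's REST endpoint for fetching a single commit (`GET /repos/{owner}/{repo}/commits/{ref}`), and the function names and the `opener` parameter are hypothetical helpers introduced here so the logic can be exercised without network access.

```python
import urllib.error
import urllib.request

API_ROOT = "https://api.github.com"


def commit_url(owner: str, repo: str, sha: str) -> str:
    """Build the REST API URL for a commit looked up through a given repository."""
    return f"{API_ROOT}/repos/{owner}/{repo}/commits/{sha}"


def commit_is_reachable(owner: str, repo: str, sha: str,
                        opener=urllib.request.urlopen) -> bool:
    """Return True if the commit SHA resolves through owner/repo.

    `opener` is a hypothetical injection point (defaults to urlopen) so the
    function can be tested offline. Unauthenticated requests are rate limited.
    """
    request = urllib.request.Request(
        commit_url(owner, repo, sha),
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "cfor-check-sketch",  # assumed, arbitrary client name
        },
    )
    try:
        with opener(request) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        # 404 or 422: the commit is not visible through this repository.
        if err.code in (404, 422):
            return False
        raise
```

To reproduce the scenario in the article, a user would record the SHA of a commit pushed to a fork, delete the fork, and then call `commit_is_reachable` with the original repository's owner and name. A `True` result would indicate that the commit is still served through the repository network.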

"You can access data from deleted forks, deleted repositories and even private repositories on GitHub. And it is available forever. This is known by GitHub, and intentionally designed that way," Leon wrote in the blog. "This is such an enormous attack vector for all organizations that use GitHub that we're introducing a new term: cross fork object reference (CFOR). A CFOR vulnerability occurs when one repository fork can access sensitive data from another fork (including data from private and deleted forks)."

Leon added that once users fork a repository, commits that contain sensitive data can still be accessed even after deletion. Therefore, any public repository with at least one fork "may be accessible forever," he said. Leon tested private repositories as well and found another troublesome pattern. He demonstrated how anyone could access commit data from a private, internal version of a repository, because such repositories often have a public version as well.

"Unfortunately, this workflow is one of the most common approaches users and organizations take to developing open-source software. As a result, it's possible that confidential data and secrets are inadvertently being exposed on an organization's public GitHub repositories," the blog said.

Leon listed several implications of this intentional design. He warned that as long as one fork exists, any commit to that repository network will exist on GitHub permanently. He added that most GitHub users don't understand how repositories work, which poses significant security concerns.

"This further cements our view that the only way to securely remediate a leaked key on a public GitHub repository is through key rotation," the blog said.

Aside from the tests Leon conducted, he stressed that there are additional ways deleted and private repository data can be accessed by anyone.

TechTarget Editorial contacted Truffle Security regarding the feature and the potential risks it poses. "We believe that although the issues independently were known and publicly documented, when taken together, the vast majority of GitHub users were likely unaware of the risks and dangers," Truffle CEO Dylan Ayrey said in a statement.

TechTarget Editorial also contacted GitHub regarding the research, and a spokesperson provided the following statement: "GitHub is committed to investigating reported security issues. We are aware of this report and have validated that this is expected and documented behavior inherent to how fork networks work. You can read more about how deleting or changing visibility affects repository forks in our documentation."

Truffle's research is the latest report to reveal potential security weaknesses in the popular developer platform. In April, New York University professor Justin Cappos discovered a vulnerability in GitHub that exposed sensitive security reports.

While there are no reports of compromised deleted repositories, threat actors have often targeted GitHub as an attack vector. For example, in April, software vendor Checkmarx published research that showed how threat actors leveraged GitHub for supply chain attacks. The campaign tricked developers into downloading malicious code by manipulating the search function.

Arielle Waldman is a news writer for TechTarget Editorial covering enterprise security.
