[UA-EAI] Check code automatically for UA issues? GitHub Code Scanning, CodeQL
Jim DeLaHunt
list+uasg at jdlh.com
Sat May 30 04:40:10 UTC 2020
UA EAI WGs:
As I mentioned in our meeting on 26 May, I found out recently about a
capability which opens the possibility for scanning code automatically
for UA issues. It would take a technical effort to adapt existing
technology for UA purposes, and a measurement campaign to apply the UA
scanning to the repositories at GitHub and elsewhere. But if we could
do it, we might be able to multiply our impact on open-source code.
GitHub announced a service called Code Scanning earlier this month at their
GitHub Satellite conference. Code Scanning is a service for running
automated queries which look for security vulnerabilities in source
code. They plan to run these queries on pull requests, and periodically
on the master branch, of open source repositories in GitHub. The queries
are written in a language called CodeQL. This language treats source
code as data to be parsed and queried.
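To make "source code as data" concrete, here is a minimal sketch in Python. It is not CodeQL, and I have not checked how CodeQL expresses the same idea. It uses Python's standard ast module to parse a small sample program and list every call of the form re.<function>("literal pattern", ...). The sample program and the helper name find_regex_literals are made up for illustration.

```python
import ast

# Hypothetical sample program to be analysed; it is parsed, never executed.
SOURCE = '''
import re
EMAIL = re.compile(r"^[a-z0-9.]+@[a-z0-9.]+$")
def check(addr):
    return re.match("[a-z]+", addr)
'''

def find_regex_literals(source):
    """Return sorted (line number, pattern) pairs for re.<func>("literal", ...) calls."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and isinstance(node.func.value, ast.Name)
                and node.func.value.id == "re"
                and node.args
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)):
            found.append((node.lineno, node.args[0].value))
    return sorted(found)

hits = find_regex_literals(SOURCE)
for line, pattern in hits:
    print(f"line {line}: regular expression literal {pattern!r}")
```

A real query language would follow names through imports and aliases; this sketch only matches the literal spelling re.<function>.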
Presently, they have scans for security vulnerabilities and secrets
disclosure. For example, a query can read the source code, and detect
that a value is accepted as user input by module A, passed through
module B, then used in a database operation in module C, without being
sanitised against malicious input. Or a query can look at a password
parameter passed to a system API, and determine that the value of that
password parameter is stored in plain view in the source code.
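As an illustration of the first kind of flaw, here is a small Python sketch of the pattern such a query is meant to catch. The three functions stand in for modules A, B and C; all names and the table layout are hypothetical. The unsafe version builds SQL text from the untrusted value, and the safe version binds the same value as a parameter.

```python
import sqlite3

def read_user_input(raw):
    # "Module A": accepts an untrusted value from the user.
    return raw

def pass_through(value):
    # "Module B": forwards the value; trimming whitespace does not sanitise it.
    return value.strip()

def find_user_unsafe(conn, name):
    # "Module C": the untrusted value becomes part of the SQL text itself.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]

def find_user_safe(conn, name):
    # Same database operation, but the value is bound as a parameter.
    query = "SELECT name FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]

def make_db():
    # In-memory database with two sample rows.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])
    return conn

conn = make_db()
malicious = pass_through(read_user_input("x' OR '1'='1"))
unsafe_rows = sorted(find_user_unsafe(conn, malicious))
safe_rows = find_user_safe(conn, malicious)
print("unsafe:", unsafe_rows)
print("safe:", safe_rows)
```

With the malicious input, the unsafe lookup returns every row in the table, while the parameterised lookup returns none.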
If CodeQL queries can do that sort of detection, then it seems to me we
might be able to get queries written that detect which class a Java
program uses to handle URLs or domain names. Or we might be able to detect that an
email address is compared to a regular expression. Or perhaps other
UA-obstructing behaviour. Might the Technology WG want to take on the
task of figuring out how to get the queries written?
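To show why an email address compared to a regular expression is worth detecting, here is a short Python sketch. The pattern is a hypothetical example of the ASCII-only validation often seen in application code, not one taken from any particular project. It accepts a plain ASCII address but rejects both an address under a long top-level domain and an internationalised address.

```python
import re

# Hypothetical ASCII-only pattern: ASCII local part, ASCII domain,
# and a top-level domain of 2 to 6 ASCII letters.
ASCII_ONLY_EMAIL = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}$")

def looks_valid(address):
    """Return True if the address matches the ASCII-only pattern."""
    return ASCII_ONLY_EMAIL.match(address) is not None

samples = [
    "user@example.com",          # short ASCII top-level domain
    "user@example.photography",  # top-level domain longer than 6 letters
    "用户@例子.中国",              # internationalised local part and domain
]

results = {address: looks_valid(address) for address in samples}
for address, ok in results.items():
    print(f"{address}: {'accepted' if ok else 'rejected'}")
```

A scan that flags patterns like this one would point maintainers at code that is likely to turn away valid addresses.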
If we have such queries, perhaps we could persuade GitHub to scan for UA
problems in addition to security and secrets problems. Or at the very
least, we can post the queries so that projects hosted at GitHub and
elsewhere could run them of their own accord. Might the Measurement WG
want to figure out how to plug this into the GitHub code scanning service?
See a 23-minute introduction video at
<https://githubsatellite.com/schedule/#stopping-vulnerabilities-at-the-source>
(or <https://youtu.be/58N0_0HCDPE>).
News article "GitHub Code Scanning aims to prevent vulnerabilities in
open source software"
<https://www.helpnetsecurity.com/2020/05/08/github-code-scanning/>
<https://lgtm.com/> is, I believe, the originator of the CodeQL
technology, before GitHub acquired them.
<https://lgtm.com/help/lgtm/about-lgtm> is a starting point to learn
about CodeQL.
If there is interest in talking about this possibility at the WG
meetings, I am happy to share what I know so far. (But I don't know
much, and most of it is already in this email.) I have already shared
this news with the UA Technology and UA Measurement working groups.
Best regards,
—Jim DeLaHunt, software engineer, Vancouver, Canada
--
--Jim DeLaHunt, jdlh at jdlh.com http://blog.jdlh.com/ (http://jdlh.com/)
multilingual websites consultant
355-1027 Davie St, Vancouver BC V6E 4L2, Canada
Canada mobile +1-604-376-8953