General
Supervisor: Florian Quinkert
Start: as soon as possible
Duration: 3 months
Further details:
Google handles more than 3.5 billion search queries per day. Most of these queries consist of simple keywords or simple operators such as AND or OR. However, advanced operators are also available, e.g., to return only files of a certain file type or pages with a certain string in the title. A malicious user can combine these operators into specifically crafted search queries in order to find vulnerable systems or sensitive information such as passwords or social security numbers. This technique is called Google hacking or Google dorking.

In 2005, a tool called Google Hack Honeypot was developed to provide a detection mechanism for this technique. However, the tool was only maintained until 2007, so no up-to-date honeypot for Google dorks is available. Websites like exploit-db collect Google dorks and frequently add new ones to their databases. Since new and improved web applications require new Google dorks, it is very likely that new Google dorks will continue to appear regularly.

This thesis aims at gaining a better understanding of Google dork usage. To this end, a tool that automatically harvests Google dorks from websites like exploit-db shall be developed. Afterwards, the extracted Google dorks shall be used to automatically create Google dork honeypots, i.e., systems that appear in the results when the corresponding Google dork is used in a Google search. Glastopf is a well-established web application honeypot. It is recommended to build the Google dork honeypot on top of Glastopf in order to have a solid basis.
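To make the idea of turning a harvested dork into a honeypot page more concrete, the following is a minimal sketch in Python. It is not how Glastopf or Google Hack Honeypot work internally. The function names `parse_dork` and `build_bait_page`, the small operator set, and the page layout are all assumptions chosen for illustration. The sketch splits a dork into operator/value pairs and renders a bare HTML page whose title and body contain the strings the dork searches for.

```python
import re
from html import escape

# Operators handled by this sketch (an assumed subset, Google supports more).
KNOWN_OPERATORS = {"intitle", "inurl", "filetype", "intext", "site"}

# An optional "operator:" prefix followed by a quoted phrase or a bare word.
TOKEN_RE = re.compile(r'(?:(\w+):)?("([^"]*)"|\S+)')


def parse_dork(dork):
    """Split a dork into (operator, value) pairs.

    Plain keywords and tokens with an unknown prefix get operator None.
    """
    parts = []
    for match in TOKEN_RE.finditer(dork):
        operator, raw, quoted = match.group(1), match.group(2), match.group(3)
        value = quoted if quoted is not None else raw
        if operator is not None and operator.lower() not in KNOWN_OPERATORS:
            # Unknown prefix (e.g. "http:"): keep the whole token as a keyword.
            operator, value = None, match.group(0)
        parts.append((operator.lower() if operator else None, value))
    return parts


def build_bait_page(parts):
    """Render a hypothetical bait page containing the strings a dork looks for."""
    title = " ".join(v for op, v in parts if op == "intitle") or "Index"
    body = " ".join(v for op, v in parts if op in (None, "intext"))
    return (
        "<html><head><title>{}</title></head>"
        "<body><p>{}</p></body></html>"
    ).format(escape(title), escape(body))


if __name__ == "__main__":
    example = 'intitle:"index of" filetype:sql password'
    parsed = parse_dork(example)
    print(parsed)
    print(build_bait_page(parsed))
```

Operators such as `inurl` and `filetype` constrain the URL rather than the page content, so a real honeypot would also have to serve the page under a matching path. That part is left out of this sketch.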
Requirements:
- Good programming skills, preferably in Python or some other high-level language.
- Comfortable working with Linux-based systems.
- Experience with tools like Git, Vagrant, and/or Docker.