For any company, data confidentiality is a matter of high importance. A leak of clients’ usernames and passwords or a loss of system files may lead to serious financial losses and destroy the reputation of even the most trustworthy organization. This article is by Vadim Kulish, security testing engineer.

Considering all potential risks, companies spend large sums on the latest security technologies to prevent unauthorized access to valuable data.

But have you ever considered that, besides sophisticated hacking attacks, there are simple ways to uncover files that weren’t effectively protected? In this article we’ll focus on Google search operators that can be used to get more specific search results or to detect sensitive information.

Let’s start from the beginning.

One can hardly imagine surfing the Internet without search engines such as Google, Bing and the like. Search engines index a vast number of web pages to make them discoverable.

Google search operators

When you search in Google, you can include search operators in the entry field to narrow or broaden your search. The most commonly used of them are the following:

* site: returns results from certain sites or domains

E.g.: If you enter site: followed by a website’s domain, you’ll get everything Google has indexed for that website.

* filetype: searches for a specific file type

E.g.: The entry filetype:php will provide you with a list of PHP files from the website.

* inurl: searches for specific text in the indexed URL

E.g.: The entry inurl:admin will return pages with the word “admin” in their URL, such as a website’s administration panel.

* intitle: searches for query terms in the page’s title

E.g.: The entry intitle:“Index of” will return documents from the website that contain the phrase “Index of” in their titles.

* cache: shows the copy of a page stored in Google’s cache

E.g.: The entry cache: followed by a page address will show Google’s cached version of the page instead of the current one.
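
The operators above can be combined in a single query. As a minimal sketch, the helper below joins operator and value pairs into one search string and turns it into a search URL. The function names and the domain example.com are illustrative assumptions, not part of the original article.

```python
from urllib.parse import urlencode


def build_query(terms=(), **operators):
    """Join operator:value pairs and quoted phrases into one search string."""
    parts = []
    for name, value in operators.items():
        # Values containing spaces must be quoted, e.g. intitle:"Index of".
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{name}:{value}")
    # Free-text terms are quoted so they are matched as exact phrases.
    parts += [f'"{term}"' for term in terms]
    return " ".join(parts)


def search_url(query):
    """Return a Google search URL carrying the query in the q parameter."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# example.com is a placeholder domain used only for illustration.
query = build_query(site="example.com", filetype="php")
print(query)
print(search_url(query))
```

Keeping the query building separate from the URL encoding makes it easy to review the exact search string before opening it in a browser.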

Unfortunately, web crawlers are not able to determine the type or degree of confidentiality of the information they index. They therefore treat a blog article published for a wide audience in the same way as a database backup copy that is stored in the web server root directory and not intended for third parties.

Because of this, and with the help of search operators, hackers manage to detect vulnerabilities in web resources, information leaks (backup copies and the text of web application errors), and hidden resources such as administration panels left open without authentication and authorization mechanisms.

Types of information that can be detected by search engines and may be potentially interesting to hackers include the following:

* Third-level domains of the explored resource

Third-level domains can be found using the keyword “site:”. For example, the query site:. * will return all third-level domains of the website. Such queries make it possible to detect hidden management resources, release management systems, and other applications with a web interface.

* Hidden files on a web server

When searching, you may come across parts of the web application that were not meant to be public. To find them, use the query filetype:php. It may reveal functionality that is not reachable through the application’s normal navigation, as well as other information about the app.

* Backup copies

Backup copies may be found with the filetype: keyword. Backup copies are usually stored with the following file extensions: bak, tar.gz, sql. For instance: site:. * filetype:sql. Backup copies often contain logins and passwords for admin interfaces, as well as user data and the source code of your website.

* Errors of the web application

The text of an error may contain various data about the app’s system components (web server, database, web application platform). This information is very interesting to hackers because it allows them to find out more about the target system and refine the attack. For instance: site: “warning” “error”.

* Login and password

A cracked web application may expose a large amount of users’ sensitive data. The query filetype:txt “login” “password” will allow you to find files with usernames and passwords. Likewise, you can check whether your email or any other account has been compromised. Just run the query filetype:txt “user_name_or_email”.
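
The checks listed above can be collected into one set of queries to run against your own site. The sketch below is only an illustration: example.com stands in for a domain you own or are authorized to test, the wildcard form site:*.domain is an assumed way to write the subdomain query, and whether Google accepts a compound extension such as tar.gz in filetype: is not verified here.

```python
def audit_queries(domain):
    """Build one search query per type of exposed information."""
    # Backup extensions named in the text above.
    backup_extensions = ["bak", "tar.gz", "sql"]
    queries = {
        # Assumed wildcard syntax for listing third-level domains.
        "subdomains": f"site:*.{domain}",
        "php_files": f"site:{domain} filetype:php",
        "error_messages": f'site:{domain} "warning" "error"',
        "credentials": f'site:{domain} filetype:txt "login" "password"',
    }
    for ext in backup_extensions:
        queries[f"backup_{ext}"] = f"site:{domain} filetype:{ext}"
    return queries


# example.com is a placeholder; replace it with a domain you may test.
for label, query in audit_queries("example.com").items():
    print(f"{label}: {query}")
```

Restricting every query with site: keeps the audit focused on a single resource instead of the whole index.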

The combinations of keywords and search strings used to detect confidential information are commonly called Google Dorks.
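
To make the structure of a dork more concrete, here is a rough sketch that splits a query into its operators, quoted phrases and plain keywords. The parser and its name are hypothetical and cover only the simple forms used in this article. A repeated operator overwrites the earlier value.

```python
import re

# One token is either operator:value, a quoted phrase, or a bare word.
TOKEN = re.compile(r'(\w+):("[^"]*"|\S+)|"([^"]*)"|(\S+)')


def parse_dork(dork):
    """Split a dork into (operators, phrases, words)."""
    operators, phrases, words = {}, [], []
    for name, value, phrase, word in TOKEN.findall(dork):
        if name:
            # Drop the surrounding quotes of values like "Index of".
            operators[name] = value.strip('"')
        elif word:
            words.append(word)
        else:
            phrases.append(phrase)
    return operators, phrases, words


print(parse_dork('filetype:txt "login" "password"'))
```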

The dorks are collected in the public Google Hacking Database (GHDB). Despite its name, it is not run by Google: it was started by Johnny Long and is now maintained by OffSec as part of the Exploit Database. Now any company representative, whether a CEO, a developer or a webmaster, can learn what type of sensitive data was detected with a given query. All dorks are broken down into categories to make searching easier.

Google Dorks leaving their mark in the history of hacking

Finally, here are some cases in which Google Dorks helped attackers get access to sensitive but poorly protected information:

#1. Leakage of confidential documents on the bank’s website

During a security analysis of a bank’s official website, a large number of PDF documents were detected. All documents were found with the query “site:bank-site filetype:pdf”. Interestingly, the documents turned out to contain floor plans of the bank’s branch premises across the country. That information would certainly be very interesting to bank robbers.

#2. Cardholders’ data search

Very often, after breaking into online stores, attackers gain access to users’ payment data. To make this data public, they use public services that are indexed by Google. Sample query: “Card Number” “Expiration Date” “Card Type” filetype:txt.

With all this in mind, we recommend that you check the security of your website to prevent dubious activities related to your resource.
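
One basic check can be done before a crawler ever sees the files. The sketch below walks a local copy of the web root and lists files with the backup extensions mentioned earlier (bak, sql, tar.gz). The function name and the path in the usage line are hypothetical, and the extension list is only a starting point.

```python
from pathlib import Path

# Backup extensions named earlier in the article.
RISKY_SUFFIXES = (".bak", ".sql", ".tar.gz")


def find_exposed_backups(web_root):
    """Return relative paths of files under web_root that look like backups."""
    root = Path(web_root)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file() and path.name.lower().endswith(RISKY_SUFFIXES)
    )


# Hypothetical usage; point it at your own document root:
# print(find_exposed_backups("/var/www/html"))
```

Any file this reports can be fetched by anyone who guesses its URL, so it is worth moving it out of the web root whether or not a search engine has indexed it yet.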

But we advise you to look beyond the basic checks. Turn to security testing specialists for a comprehensive analysis of your software product. After all, it is better and cheaper to prevent data loss than to repair the damage.
