D2
Administrator
Googlebot/2.1
Google is the most popular and widely used search engine in the world. One of the best things about it is Google dorking, also known as Google hacking: a technique that uses specialized search operators and keywords to find information that is not easily accessible through regular web searches. It is like the “grep” command in Linux, which you use to find exactly what you want.
Google dorking can be used for various purposes, such as finding:
- sensitive information, such as usernames, passwords, credit card numbers, email addresses, etc., that are exposed or leaked on the web
- vulnerabilities in websites and web applications, such as SQL injection, XSS, directory traversal, etc.
- confidential documents, such as PDFs, Word files, Excel sheets, etc., that are not intended to be “publicly” available.
- hidden or alternate versions of web pages, such as cached, archived, or translated pages.
- specific file types, such as images, videos, audio, etc., that match certain criteria.
- information about a specific domain, such as subdomains, IP addresses, DNS records, etc.
Contents
- What is Google Index(ing)?
- Examples of Google Dorking
- Interesting part
- Have I found anything myself?
What is Google Index(ing)?
Google indexing is the process of crawling web pages and storing them in the Google database, also known as the Google index. Google uses a program called Googlebot to visit and download web pages and follow the links on them. Googlebot then extracts the content and meaning of each page and adds it to the Google index. Once the data is in the index, we can use dorks (operators and keywords) to search for what we need.
Some of the common operators and keywords used in Google dorking are:
intitle: This operator searches for web pages that contain a specific word in the title tag. For example, intitle:"index of" will return web pages that have “index of” in their title, which indicates directory listing.
inurl: This operator searches for web pages that contain a specific word in the URL. For example, inurl:admin will return web pages that have “admin” in their URL. This may show us URLs which lead to admin panels.
site: This operator restricts the search to a specific website or domain. For example, site:mit.edu will return web pages that belong to the MIT domain, while site:edu will return web pages that belong to any .edu domain. It can be used for recon when we need data from a targeted country (for France, site:fr), and it is extremely useful for finding government domains (site:gouv.fr).
filetype: This operator searches for specific file types, such as PDF, DOC, XLS, etc. For example, filetype
intext: This operator searches for web pages that contain a specific word or phrase within the body of the page. For example, intext:"password" will return web pages that contain the word “password” anywhere on the page, which may indicate password lists or credentials.
allintext: This operator searches for web pages that contain all of the specified words within the body of the page. For example, allintext:username password will return web pages that contain both “username” and “password” in the page text.
link: This operator searches for web pages that link to a specific URL. For example, link:example.com -site:example.com will return pages on other websites that contain a link to example.com.
related: This operator searches for web pages that are similar to a specified web page. For example, related:mit.edu will return web pages that are similar to the MIT website, such as other universities.
cache: This operator shows the version of the web page that Google has in its cache. For example, cache:mit.edu will show the cached version of the MIT website, which may be different from the current version or contain information that has been removed or updated.
Operators and keywords can be combined to create complex and powerful search queries that reveal information that would not be easily accessible otherwise. Some people call such queries “private dorks” and sell them.
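Combining operators by hand gets error-prone once a query has more than two or three parts. Below is a minimal sketch of a helper that joins operator/value pairs into one query string and turns it into a search URL. The function names `build_dork` and `search_url` are hypothetical and only for illustration; `example.com` stands in for a domain you are allowed to test.

```python
from urllib.parse import quote_plus


def build_dork(operators=(), terms=(), exclude_terms=()):
    """Join plain terms, (operator, value) pairs and excluded terms into one query.

    `operators` is a sequence of (name, value) tuples, e.g. ("site", "example.com").
    """
    parts = list(terms)
    for name, value in operators:
        # A value with spaces must be quoted, otherwise only its first word
        # is bound to the operator.
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{name}:{value}")
    parts.extend(f"-{term}" for term in exclude_terms)
    return " ".join(parts)


def search_url(query):
    """URL-encode the query for the q= parameter of a Google search URL."""
    return "https://www.google.com/search?q=" + quote_plus(query)


query = build_dork(
    operators=[("intitle", "index of"), ("site", "example.com")],
    exclude_terms=["git"],
)
print(query)
print(search_url(query))
```

Keeping the operators as a list of pairs rather than a dictionary lets the same operator appear more than once, which is common with site: and inurl:.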
Examples of Google Dorking
To illustrate how Google dorking can be used to find hidden information on the web, here are some real-life cases where it was used for hacking purposes.
In 2016, the US Department of Justice charged a hacker named Hamid Firoozi with unauthorized access to a protected computer after he broke into the computer network of a New York dam in 2013. He reportedly used Google dorking to find the vulnerable system: the operator inurl:"/cgi-bin/login.pl" finds login pages served by Perl scripts, and he then exploited a known vulnerability in the login script to gain access.
In 2020, the source code for the onboard logic units (OLUs) of Mercedes-Benz vans, along with passwords and API tokens for Daimler’s systems, was leaked online by a Swiss researcher who found a misconfigured GitLab server using Google Dorking. The OLUs are devices that connect the vans to the cloud and enable third-party apps to access vehicle data.
Interesting part
Google dorks can be combined. If I want to search for debug files on French domains, I can use this dork:
Code:
inurl:debug.log filetype:log -git -github site:fr
Image [1]
To find an open redirect vulnerability, I can use a dork with a single operator:
Code:
inurl:sap/public/bc/icf/
This dork could be considered private, because not everyone is going to search for SAP redirects through Google dorking.
Bitbucket repos can be found with:
Code:
inurl:repos?visibility=public
If we want some configuration directories, we can use:
Code:
intitle:"index of" inurl:/config
Such a directory may leak usernames, passwords, keys, and other sensitive information. Don’t underestimate it.
Image [2]
Image [3]
The obvious source of dorks is the Google Hacking Database at https://www.exploit-db.com/google-hacking-database, but two of the dorks I wrote above are not listed there, and they can still be useful. I advise you to check which CVEs hackers exploit most often and write dorks based on them. Not everything has to be “Shodan-ed” if it can be “Googled”.
What are other methods for private dorks?
Besides writing dorks from CVEs, you can follow the news and write dorks based on fresh keywords. Pick a specific category, for example streaming. The best-known streaming services are Netflix, Disney+, Hulu, and Amazon Prime. The most basic thing we can do is check which new shows are coming to these platforms.
Image [4]
Code:
"Yu" "Hakusho" "trailer" -news -site:youtube.com -site:pinterest.com -site:tiktok.com -site:netflix.com -site:reddit.com -site:instagram.com -site:facebook.com -site:twitter.com -site:google.com -site:linkedin.com -site:amazon.com -site:wikipedia.org
I used the word “trailer” so that the results would be mostly fresh, then excluded the word “news” and a number of well-known websites. The results may include sites with SQL injection vulnerabilities, even though we didn’t use any vulnerability-related keywords such as “inurl:id”.
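A long list of -site: exclusions is easy to mistype, and a stray space after the colon silently breaks the operator. Here is a rough sketch, assuming the keywords and excluded domains are kept in plain Python lists, that assembles a query of this shape. The helper names `exclude_sites` and `keyword_query` are hypothetical, made up for this example.

```python
def exclude_sites(domains):
    """Turn a list of domains into '-site:domain' terms, rejecting malformed entries."""
    terms = []
    for domain in domains:
        domain = domain.strip().lower()
        # Whitespace inside a domain would split the operator from its value.
        if not domain or any(ch.isspace() for ch in domain):
            raise ValueError(f"not a usable domain: {domain!r}")
        terms.append(f"-site:{domain}")
    return " ".join(terms)


def keyword_query(keywords, excluded_words, excluded_domains):
    """Quote each keyword, then append excluded words and excluded sites."""
    parts = [f'"{kw}"' for kw in keywords]
    parts.extend(f"-{word}" for word in excluded_words)
    sites = exclude_sites(excluded_domains)
    if sites:
        parts.append(sites)
    return " ".join(parts)


print(keyword_query(
    ["Yu", "Hakusho", "trailer"],
    ["news"],
    ["youtube.com", " tiktok.com", "wikipedia.org"],
))
```

Normalizing each domain in one place means the exclusion list can be reused across different keyword sets.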
Relying on a single source is a mistake. What should actually be done is to follow the news, write down keywords that are likely to be fresh, and then generate dorks from them. What I showed above is a very basic method with only three keywords.
There is a tool called TSP dork generator, which is available in this forum also: https://xss.is/threads/59696/ (use in VM, may contain malware)
This tool is great because by default it favors “quantity” over “quality” and generates a lot of dorks. To make the dorks more “private”, we have to supply our own page types, search functions, page formats, etc.
First let’s look at what this tool can do. “Dork Types” are basically the formulas used to combine the extensions, keywords, and the other parts. Domain extensions are what the name says. You can also put in domain names instead of extensions: for example, instead of typing .com you can use example.com, so that the results come from example.com and its subdomains. You would do that when running a specific test.
Keywords should be the fresh keywords we collected above (from news, social media, etc.). Pagetypes are the parameters we are aiming for, and they depend on the vulnerability we want to find. If we are searching for SQLi, anything with “id” will do; for open redirect, “redirect=”; for file inclusion, “file=”; and so on. One parameter won’t be that helpful, so we can ask AI to generate multiple parameters.
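The way the formulas combine the other lists can be pictured as a Cartesian product over templates. This is only a sketch of the idea, not the actual format or logic of TSP dork generator: the placeholder names `{KW}`, `{PT}`, `{PF}`, `{DE}` and the function `generate_dorks` are assumptions made for illustration, and the domain is pinned to a single in-scope target as in the “specific test” case above.

```python
from itertools import product


def generate_dorks(templates, keywords, page_types, page_formats, domain):
    """Fill every template with every keyword / page type / page format combination.

    Placeholders (hypothetical): {KW} keyword, {PT} page type (URL parameter),
    {PF} page format (file type), {DE} domain or domain extension.
    """
    seen = set()
    dorks = []
    for tpl, kw, pt, pf in product(templates, keywords, page_types, page_formats):
        # str.format ignores placeholders a template does not use, so templates
        # that skip {PF} or {PT} produce duplicates; keep only the first copy.
        dork = tpl.format(KW=kw, PT=pt, PF=pf, DE=domain)
        if dork not in seen:
            seen.add(dork)
            dorks.append(dork)
    return dorks


templates = [
    '"{KW}" inurl:{PT} site:{DE}',
    'intext:"{KW}" filetype:{PF} site:{DE}',
]
for dork in generate_dorks(
    templates,
    keywords=["trailer"],
    page_types=["id=", "redirect="],
    page_formats=["php"],
    domain="example.com",  # placeholder for a domain you are authorized to test
):
    print(dork)
```

The number of generated dorks grows as the product of the list sizes, which is why editing the lists matters more than adding to them.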
Image [5]
Image [6]
Image [7]
Image [8]
Image [9]
PageFormats are basically file types. We can ask AI to generate ones that may have been missed.
Image [10]
Image [11]
Search Functions are the dork operators that will be used. You can edit them too, but I prefer to leave them as they are.
Image [12]
By editing everything above and trying to search for new keywords, you will be able to generate private dorks for sure.
Have I found anything myself?
I love big scopes when I automate things, so yes, I use Google dorking to find juicy .xml or .yml files. To be honest, most of the results are low quality, but when I get an endpoint I scan it for .zip files (Image 13), and sometimes it leaks backup files. The .log files also help a lot. The problem is that automating this without proxies is useless.
Image [13]