OSINT is in many ways the mirror image of operational security (OPSEC), which is the security process by which organizations protect public data about themselves that could, if properly analyzed, reveal damaging truths. In-house security teams perform OSINT operations on their own organizations to shore up operational security. They try to find sensitive information that the company might not realize is public. This allows them to protect exposed data or anticipate what information an attacker might have about the organization. That information is critical when assessing risk, prioritizing security resources, and improving security practices and policies.
The world at the time was changing, and even though social media had not yet made the scene, there were plenty of sources like newspapers and publicly available databases that contained interesting and sometimes useful information, especially if someone knew how to connect a lot of dots. The term OSINT was originally coined to refer to this kind of spycraft.
These same techniques can now be applied to cybersecurity. Most organizations have vast, public-facing infrastructures that span many networks, technologies, hosting services and namespaces. Information can be stored on employee desktops, in legacy on-prem servers, on employee-owned devices, in the cloud, embedded inside devices like webcams, or even hidden in the source code of active apps and programs.
In fact, security and IT staff at large companies almost never know about every asset in their enterprise, public or not. Add in the fact that many organizations also own or control several additional assets indirectly, such as their social media accounts, and there is potentially a lot of information sitting out there that could be dangerous in the wrong hands.
OSINT is crucial in keeping tabs on that information chaos. IT needs to fulfill three important tasks within OSINT, and a wide range of OSINT tools have been developed to help meet those needs. Most tools serve all three functions, though many excel in one particular area.
Their most common function is helping IT teams discover public-facing assets and map what information each possesses that could contribute to a potential attack surface. Their main job is recording what information someone could publicly learn about company assets without resorting to hacking, not looking for things like program vulnerabilities or performing penetration testing.
A secondary function that some OSINT tools perform is looking for relevant information outside of an organization, such as in social media posts or at domains and locations that might be outside of a tightly defined network. Organizations that have made a lot of acquisitions, bringing along the IT assets of the company they are merging with, could find this function very useful. Given the extreme growth and popularity of social media, looking outside the company perimeter for sensitive information is probably helpful for just about any group.
Finally, some OSINT tools help to collate and group all the discovered information into useful and actionable intelligence. Running an OSINT scan for a large enterprise can yield hundreds of thousands of results, especially if both internal and external assets are included. Piecing all that data together and being able to deal with the most serious problems first can be extremely helpful.
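As a rough illustration of that collation step, the sketch below deduplicates raw findings, groups them by asset, and orders the assets so the most serious problems come first. The `Finding` record, the `triage` function and the four-level severity scale are hypothetical names chosen for this example; they are not taken from any particular OSINT tool.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical severity scale; a real tool would define its own.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass(frozen=True)
class Finding:
    asset: str     # hostname, account or document the finding concerns
    kind: str      # e.g. "exposed-credential", "open-port"
    severity: str  # one of the SEVERITY_RANK keys


def triage(findings):
    """Deduplicate findings, group them by asset, and order assets worst-first."""
    by_asset = defaultdict(list)
    for finding in set(findings):  # frozen dataclasses hash by value
        by_asset[finding.asset].append(finding)
    for items in by_asset.values():
        items.sort(key=lambda f: (SEVERITY_RANK[f.severity], f.kind))
    # Rank each asset by its most serious finding, then by name for stability.
    return sorted(
        by_asset.items(),
        key=lambda kv: (SEVERITY_RANK[kv[1][0].severity], kv[0]),
    )


if __name__ == "__main__":
    sample = [
        Finding("a.example.com", "open-port", "low"),
        Finding("b.example.com", "exposed-credential", "critical"),
        Finding("a.example.com", "open-port", "low"),  # duplicate result
    ]
    for asset, items in triage(sample):
        print(asset, [(f.kind, f.severity) for f in items])
```

Sorting on the worst finding per asset is only one possible policy; a team might equally weight by asset criticality or by how many findings an asset has.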
Using the right OSINT tool for your organization can improve cybersecurity by helping to discover information about your company, employees, IT assets and other confidential or sensitive data that could be exploited by an attacker. Discovering that information first and then hiding or removing it could reduce everything from phishing to denial-of-service (DoS) attacks. Professionals who regularly perform OSINT operations will often use a suite of tools depending on their environment and preferences.
Once the information is gathered, Maltego makes connections that can unmask the hidden relationships between names, email addresses, aliases, companies, websites, document owners, affiliations and other information that might prove useful in an investigation, or to look for potential future problems. The program itself runs in Java, so it works with Windows, Mac and Linux platforms.
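The general idea of linking entities can be sketched without Maltego itself. The example below is not Maltego's algorithm or API; it is a minimal illustration, assuming each gathered record is a small dictionary of attributes, that treats every attribute value as a node, links values that appear in the same record, and then walks the graph to find everything indirectly related to a starting entity. The function names `build_graph` and `related` and the sample data are hypothetical.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_graph(records):
    """Link every pair of (type, value) entities that share a record."""
    graph = defaultdict(set)
    for record in records:
        entities = list(record.items())
        for entity in entities:
            graph[entity]  # make sure isolated entities still get a node
        for a, b in combinations(entities, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def related(graph, start):
    """Return every entity reachable from start (breadth-first search)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen - {start}


if __name__ == "__main__":
    records = [
        {"name": "Alice Doe", "email": "adoe@example.com"},
        {"email": "adoe@example.com", "domain": "example.com"},
        {"name": "Bob Roe", "email": "bob@other.test"},
    ]
    graph = build_graph(records)
    # Alice is tied to example.com only through her shared email address.
    print(related(graph, ("name", "Alice Doe")))
```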
There is a free version of the program with limited features called Maltego CE. Desktop versions of Maltego XL run $1,999 per instance. Server installations for large-scale commercial use start at $40,000 and come with a complete training program.
Available as a Chrome extension and Firefox add-on, Mitaka lets you search over six dozen search engines for IP addresses, domains, URLs, hashes, ASNs, Bitcoin wallet addresses, and various indicators of compromise (IOCs) from your web browser.
Spiderfoot is a free OSINT reconnaissance tool that integrates with multiple data sources to gather and analyze IP addresses, CIDR ranges, domains and subdomains, ASNs, email addresses, phone numbers, names and usernames, BTC addresses, etc. Available on GitHub, Spiderfoot comes with both a command-line interface and an embedded web-server for providing an intuitive web-based GUI.
The application itself comes with over 200 modules making it ideal for red teaming reconnaissance activities, to discover more information about your target or identify what you or your organisation may be inadvertently exposing on the internet.
As the name implies, BuiltWith lets you find what popular websites are built with. Different tech stacks and platforms power different sites. BuiltWith can, for example, detect whether a website is using WordPress, Joomla, or Drupal as its CMS and provide further details.
BuiltWith also generates a neat list of known JavaScript/CSS libraries (e.g., jQuery or Bootstrap) that a website uses. Further, the service provides a list of plugins installed on the websites, frameworks, server information, analytics and tracking information, etc. BuiltWith can be used for reconnaissance purposes.
For those looking to identify mainly the tech stack makeup of a site, Wappalyzer may be better suited as it provides a more focused, concise output. Try both BuiltWith and Wappalyzer for yourself and see which suits your needs better.
Intelligence X previously preserved the list of over 49,000 Fortinet VPNs that were found vulnerable to a path traversal flaw. Later that week, plaintext passwords to these VPNs were also exposed on hacker forums; although they were subsequently removed from those forums, Intelligence X preserved them as well.
How do you search across half a million git repos across the internet? Sure, you could try individual search bars offered by GitHub, GitLab, or BitBucket, but Grep.app does the job super efficiently. In fact, Grep.app was recently used by Twitter users and journalists on multiple occasions to get an idea of approximately how many repositories were using the Codecov Bash Uploader.
Grep.app can also be useful when searching for strings associated with IOCs, vulnerable code, or malware (such as the Octopus Scanner, Gitpaste-12, or malicious GitHub Action cryptomining PRs) lurking in OSS repos.
Developers who work in Python have access to a powerful tool in Recon-ng, which is written in that language. Its interface looks very similar to the popular Metasploit Framework, which should reduce the learning curve for those who have experience with it. It also has an interactive help function, which many Python modules lack, so developers should be able to pick it up quickly.
Designed so that even the most junior Python developers can create searches of publicly available data and return good results, it has a very modular framework with a lot of built-in functionality. Common tasks like standardizing output, interacting with databases, making web requests and managing API keys are all part of the interface. Instead of programming Recon-ng to perform searches, developers simply choose which functions they want it to perform and build an automated module in just a few minutes.
The sources that theHarvester uses include popular search engines like Bing and Google, as well as lesser known ones like dogpile, DNSdumpster and the Exalead meta data engine. It also uses Netcraft Data Mining and the AlienVault Open Threat Exchange. It can even tap the Shodan search engine to discover open ports on discovered hosts. In general, theHarvester tool gathers emails, names, subdomains, IPs and URLs.
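To make the kind of output concrete, here is a minimal sketch of the extraction step only: pulling email addresses and subdomains for one target domain out of text that has already been collected. It is not theHarvester's code and does not query any of the sources listed above; the `harvest` function and its regular expressions are assumptions made for illustration, and they are deliberately simple (for instance, they do not guard against look-alike hosts such as `example.com.evil.net`).

```python
import re


def harvest(text, domain):
    """Return (emails, hosts) found in text that belong to the given domain."""
    d = re.escape(domain)
    # local-part@ followed by zero or more subdomain labels and the domain
    email_re = re.compile(
        r"[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\.)*" + d + r"\b", re.IGNORECASE
    )
    # One or more labels before the domain, not preceded by '@' or another
    # hostname character, so email domains are not counted as hosts.
    host_re = re.compile(
        r"(?<![A-Za-z0-9.@-])(?:[A-Za-z0-9-]+\.)+" + d + r"\b", re.IGNORECASE
    )
    emails = sorted({m.lower() for m in email_re.findall(text)})
    hosts = sorted({m.lower() for m in host_re.findall(text)})
    return emails, hosts


if __name__ == "__main__":
    page = (
        "Contact Jane.Doe@example.com or see https://vpn.example.com/login "
        "and mail.corp.example.com. Not ours: foo@other.org"
    )
    emails, hosts = harvest(page, "example.com")
    print(emails)
    print(hosts)
```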
Shodan is a dedicated search engine for finding intelligence about devices, such as the billions that make up the internet of things (IoT), which ordinary search engines rarely index but which are everywhere these days. It can also be used to find things like open ports and vulnerabilities on targeted systems. Some other OSINT tools like theHarvester use it as a data source, though deep interaction with Shodan requires a paid account.
In addition to IoT devices like cameras, building sensors and security devices, Shodan can also be turned to look at things like databases to see if any information is publicly accessible through paths other than the main interface. It can even work with videogames, discovering things like Minecraft or Counter-Strike: Global Offensive servers hiding on corporate networks where they should not be, and what vulnerabilities they generate.
Anyone can purchase a Freelancer license and use Shodan to scan up to 5,120 IP addresses per month, with a return of up to a million results. That costs $59 per month. Serious users can buy a Corporate license, which provides unlimited results and scanning of up to 300,000 IPs monthly. The Corporate version, which costs $899 per month, includes a vulnerability search filter and premium support.
Another freely available tool on GitHub, Metagoofil is optimized to extract metadata from public documents. Metagoofil can investigate almost any kind of document that it can reach through public channels including .pdf, .doc, .ppt, .xls and many others.
The amount of interesting data that Metagoofil can gather is impressive. Searches return things like the usernames associated with discovered documents, as well as real names if available. It also maps the paths of how to get to those documents, which in turn would provide things like server names, shared resources and directory tree information about the host organization.
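To show why a document path is so revealing, the sketch below splits a Windows UNC path of the form `\\server\share\dir\file` into its parts. It is a hand-written illustration, not Metagoofil's implementation; the `parse_unc` function and the sample path are hypothetical, and it assumes the path string has already been recovered from a document's metadata.

```python
def parse_unc(path):
    """Split a UNC path into server, share, directories and file name.

    Returns None when the path is not of the form \\\\server\\share\\...
    """
    if not path.startswith("\\\\"):
        return None
    parts = [p for p in path[2:].split("\\") if p]
    if len(parts) < 2:
        return None
    server, share, *rest = parts
    return {
        "server": server,
        "share": share,
        "directories": rest[:-1],          # folders between share and file
        "file": rest[-1] if rest else None,
    }


if __name__ == "__main__":
    # Hypothetical path of the kind that can end up in document metadata.
    print(parse_unc(r"\\FILESRV01\Finance\Q3\budget.xls"))
    # A local drive path carries no server or share information.
    print(parse_unc(r"C:\Users\jsmith\notes.doc"))
```

Collecting the `server` and `share` values across many documents gives a rough picture of internal file servers and shared resources, which is the kind of exposure an in-house team would want to find first.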