Welcome to the fascinating (and occasionally terrifying) world of Open Source INTelligence (OSINT), an essential element in every cybersecurity toolkit. Whether you’re a cybersecurity aspirant, simply curious about intelligence collection and analysis in the digital world, or a parent interested in your children’s online safety, OSINT can be a useful tool. My aim is to give you a bit of history and help you navigate OSINT’s main foci and real-world applications, while offering insights on how to protect yourself online.

A (Very) Brief History of OSINT

OSINT traces its origins back to the days when intelligence was gathered from newspapers, television and radio broadcasts, and other publicly available sources. Historically, governments used OSINT to supplement their classified collection operations, such as Signals Intelligence (SIGINT), the collection and analysis of electronic signals, and Human Intelligence (HUMINT), the solicitation of otherwise unavailable information from people with special access to it.

The advent of the internet and the subsequent explosion of data generated daily have dramatically expanded the scope and importance of OSINT. It has transitioned from being a supplementary source of intelligence to a critical component of national security strategies, business intelligence efforts, and cybersecurity measures. The ability to gather, analyze, and interpret this openly available information is now a prized skill in the arsenal of cybersecurity, law enforcement, and intelligence professionals worldwide.

With the ubiquity of interconnected electronic devices, OSINT has come into its own as a powerful tool for the modern amateur spy. Today, OSINT encompasses a vast array of information available online—from public records to social media platforms to digital publications and datasets.

Major OSINT Sources

OSINT encompasses a wide range of sources. Let’s take a closer look at some of its main components:

Academic and Professional Publications

Scholarly articles, research papers, dissertations, conference proceedings, and industry reports often contain cutting-edge research, experimental results, and expert insights. They are an excellent resource for in-depth, authoritative information on everything from national defense to technology breakthroughs and scientific research.

For instance, a cybersecurity researcher might use academic papers to stay updated on the latest findings in network security vulnerabilities. Professional publications like white papers from technology companies can reveal emerging trends and new technologies in the industry. Examples include journals like the “Journal of Cybersecurity” and industry reports from companies like Gartner or IBM.

Commercial Data Sources

These sources encompass databases and services that compile and sell information, often used for business intelligence, market research, or customer profiling. For instance, data brokers like Experian or LexisNexis aggregate vast amounts of data on individuals, which can include consumer habits, credit histories, and even public records. For businesses, commercial datasets from providers like Bloomberg or Statista offer valuable insights into market trends, economic forecasts, and industry analyses.

Online Publications and News

This area includes digital newspapers, e-zines, blogs, online news portals, and even newsletters. They are key for real-time information on current events, trends, and public opinion. For example, an OSINT analyst might use The New York Times or BBC News to gather information on a recent cyber attack or political event. Tech blogs like TechCrunch or Wired provide insights into the latest technology trends and product releases, which can be crucial for tech-related investigations.

Public Records and Data

Public records are documents or pieces of information that are not considered confidential. This could include birth and death records, marriage licenses, property records, court documents, and government reports. The United States Patent and Trademark Office (USPTO) database provides information on patents and trademarks, which can be vital for intellectual property research. Similarly, property records, which are often available online through local government websites, can reveal ownership details of a particular property.

Would it shock you to know that, if I have your address, it would be child’s play for me to get aerial, satellite, and ground-level photographs of your home? And I’d never even have to visit the city you live in or risk getting caught gathering intelligence on you.

User-Generated Content

This encompasses the vast amount of content created and shared by users on platforms such as social media, forums, blogs, and video sharing platforms. For instance, Twitter and Facebook can be mined to gauge public sentiment on a specific topic, to spot trends, or even to track the activities of a particular individual or group. TripAdvisor reviews might be used to assess the popularity and customer experience of a tourist spot. Reddit forums can provide insights into niche communities and topics, offering raw, unfiltered opinions and discussions.


Real-World Applications of OSINT in Cybersecurity

In the cybersecurity arena, OSINT is used for a myriad of purposes, including enhancing security postures and providing valuable insights:

Cyber Threat Intelligence (CTI)

Cybersecurity professionals can use OSINT to identify potential threats and vulnerabilities. By monitoring hacker forums, social media, professional/governmental CTI feeds, and other platforms, they can uncover and track emerging threats and trends.

Security Awareness and Assessment

Organizations (and individuals) can utilize OSINT to assess their security posture. This can involve searching for data leaks, monitoring the Dark Web, identifying exposed assets, and learning the Tactics, Techniques, and Procedures (TTPs) of potential attackers. It can also mean searching public job postings for indications of the technologies (and versions) an organization uses, as well as any staffing shortcomings.

Have you ever Googled yourself? You might be shocked to find out how much information about you (and your loved ones) is available online for anyone to find and exploit.

Investigative Journalism

Journalists can leverage OSINT for investigative purposes, to uncover facts, and to gather evidence to support their stories.

Law Enforcement/Intelligence Agencies

Law enforcement agencies use OSINT for criminal investigations; it can aid in gathering critical information about criminal activities, locations, and associations. Needless to say, intelligence agencies also use OSINT as another source of corroborating information to back up or enhance their classified collection/analysis efforts.

Tools of the Trade

There is a plethora of publicly-available OSINT tools ranging from simple web search engines to sophisticated software that aggregates and analyzes data from multiple sources. Tools like Maltego and Shodan are quite popular among professionals. Many of these tools come pre-installed in Kali, a special distribution of Linux used by many offensive cybersecurity specialists.

Note that some tools require the user to purchase a license in order to utilize them at their full potential.

Shodan

Shodan is the world’s first search engine for Internet-connected devices. Unlike traditional search engines that index web content, Shodan scans for internet-connected devices and services such as servers, cameras, printers, and routers, providing detailed information about each device’s Internet Protocol (IP) address, open ports, known vulnerabilities, and the type of software running on it. Shodan is an invaluable tool for security professionals and researchers to help identify vulnerable devices and systems exposed online.
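To make that a little more concrete, here is a minimal sketch of how a Shodan search query is typically put together: free-text keywords plus filter:value pairs such as port, country, and org. The helper name build_shodan_query is my own invention for illustration, and the snippet only assembles the query string; it does not contact Shodan or require an account.

```python
def build_shodan_query(keywords="", **filters):
    """Combine free-text keywords with filter:value pairs into one query string."""
    parts = [keywords] if keywords else []
    for name, value in filters.items():
        text = str(value)
        # Values containing spaces are quoted so they stay a single token.
        if " " in text:
            text = f'"{text}"'
        parts.append(f"{name}:{text}")
    return " ".join(parts)


# "Example Corp" is a placeholder organization, not a real target.
query = build_shodan_query("webcam", port=8080, country="US", org="Example Corp")
print(query)  # webcam port:8080 country:US org:"Example Corp"
```

The resulting string can be pasted into the Shodan search box. Which filters are available to you may depend on your account level.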

Maltego

Maltego is a powerful tool for conducting open-source intelligence and forensics. It offers a depth and breadth of perspective as it focuses on link analysis. With Maltego, users can gather data from various sources and visualize the relationships and networks among different entities, such as people, groups, domains, and networks. This ability to graph complex information networks is especially useful in cyber investigations, where understanding the connections between different data points is key.

Maltego is very useful in a variety of capacities:

  • Cybersecurity Investigations: Security professionals use Maltego for cyber threat analysis. It helps in mapping out the network infrastructure of potential attackers, understanding relationships between different nodes, and identifying vulnerabilities in a system.
  • Digital Forensics: In digital forensics, Maltego can be used to uncover patterns and connections in digital evidence. It’s particularly useful in complex cases where large amounts of data need to be correlated.
  • Fraud Detection: Financial institutions and law enforcement agencies use Maltego to track and visualize the networks involved in fraudulent activities, such as phishing attacks, financial frauds, and scam operations.
  • Social Network Analysis: Maltego can analyze social networks to understand relationships and hierarchies within a group, which is useful in intelligence and law enforcement for investigating criminal networks or in business for market research.
  • Corporate Intelligence: Businesses use Maltego for competitive intelligence gathering. It helps in mapping out a competitor’s online presence, partnerships, and digital assets.
  • Law Enforcement and Counterterrorism: Law enforcement agencies and counterterrorism units use Maltego to uncover connections between individuals, locations, and organizations in criminal networks or terrorist groups.
  • Investigative Journalism: Journalists use Maltego for investigating stories, especially those that involve complex connections between entities, like in cases of corruption or international politics.
  • Research and Academic Studies: Researchers and academics use Maltego for a variety of studies that require mapping relationships and connections in large sets of data, ranging from social sciences to cybersecurity.
  • Human Resources and Background Checks: HR departments and background check companies use Maltego to research potential employees’ digital footprints and connections.

Maltego’s plentiful transforms—basically plugins that fetch information from specific sources and format the results in useful ways—coupled with its flexibility in integrating with various data sources and its powerful mapping/graphing capabilities, make it an effective tool of choice for professionals who need to analyze complex networks and relationships in diverse fields.
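If link analysis sounds abstract, the underlying idea can be sketched in a few lines of plain Python. This is not Maltego’s API or data model, just an illustration under my own assumptions: entities are strings, each link is a pair of entities, and a breadth-first search finds the shortest chain connecting two of them. All entities below are made up.

```python
from collections import defaultdict, deque


def build_graph(links):
    """Turn (entity, entity) pairs into an undirected adjacency map."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search for the shortest chain of entities from start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in sorted(graph[path[-1]]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None  # the two entities are not connected


# Made-up entities using reserved example domains and addresses.
links = [
    ("alice@example.com", "example.com"),
    ("example.com", "203.0.113.10"),
    ("203.0.113.10", "example.net"),
    ("bob@example.org", "example.org"),
]
graph = build_graph(links)
print(shortest_path(graph, "alice@example.com", "example.net"))
```

Here the email address and the second domain turn out to be linked through a shared IP address, which is exactly the kind of indirect relationship a graph view makes obvious.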

theHarvester

theHarvester is designed to gather sensitive information from various public sources like search engines and social media platforms. It’s particularly effective in collecting email addresses, subdomains, host names, and employee names. This information can be used in penetration testing or cybersecurity reconnaissance to understand a target’s digital footprint. Its simplicity and effectiveness make it a favorite among penetration testers for initial data gathering phases.
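The extraction step at the heart of this kind of reconnaissance can be approximated with regular expressions. The sketch below is not theHarvester’s code and does not query any search engine; it assumes you already have some page text in hand and simply pulls out email addresses and host names under a given domain. The function name harvest and the sample text are hypothetical.

```python
import re


def harvest(text, domain):
    """Return sorted email addresses and host names under `domain` found in text."""
    escaped = re.escape(domain)
    # The trailing lookahead stops "example.com" from matching inside "example.community".
    email_re = re.compile(
        r"[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\.)*" + escaped + r"(?![\w-])",
        re.IGNORECASE,
    )
    # Require at least one label in front of the domain, i.e. a subdomain.
    host_re = re.compile(
        r"(?<![\w.-])(?:[A-Za-z0-9-]+\.)+" + escaped + r"(?![\w-])",
        re.IGNORECASE,
    )
    emails = sorted({m.lower() for m in email_re.findall(text)})
    hosts = sorted({m.lower() for m in host_re.findall(text)})
    return emails, hosts


page = "Contact jane.doe@example.com or IT@mail.example.com. Portal: https://vpn.example.com/login"
emails, hosts = harvest(page, "example.com")
print(emails)
print(hosts)
```

Note that the mail server’s host name is picked up from the email address itself, a small reminder of how much a single published contact address can reveal.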

Recon-ng

Recon-ng is a full-featured web reconnaissance framework written in Python. It has a look and feel similar to the Metasploit Framework, providing an interactive environment to conduct open-source web-based reconnaissance quickly and thoroughly. Recon-ng is modular, allowing users to leverage its powerful framework to write their own modules. It’s packed with a variety of modules that gather intelligence from various public sources and is highly valued for its efficiency and integration capabilities.

OSINT Framework

While not a tool, per se, the OSINT Framework is a collection of tools and resources categorized by the type of data they are used to collect. This framework is extremely useful for anyone in the cybersecurity field, as it provides a comprehensive directory of resources for gathering open-source intelligence. From domain name lookups to exploiting social networks, this framework guides users to the appropriate tools for every conceivable type of OSINT research.

Google Dorks

I know… I know. The name sounds a bit silly and may even elicit giggles from the more adolescent among us. But Google Dorks can be quite powerful in the hands of someone who knows how to use them.

Google Dorks employ advanced search operators in Google to help researchers find specific strings of text within search results. It’s a method used to uncover hidden information and vulnerabilities in websites. For example, using specific syntax, one can find files containing passwords or sensitive information inadvertently left accessible on web servers. Google Dorks is more of a technique than a specific tool; it’s widely used for security reconnaissance.
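As a small illustration, the sketch below generates a few dorks for a single domain using the site:, filetype:, intitle:, and inurl: operators, then encodes each one into a search URL. The function names and the choice of queries are my own; treat it as a starting checklist to run against a domain you own or are authorized to assess.

```python
from urllib.parse import quote_plus


def dorks_for(domain):
    """Return a few common dork queries scoped to a single domain."""
    return [
        f"site:{domain} filetype:pdf",        # PDF documents indexed on the site
        f'site:{domain} intitle:"index of"',  # open directory listings
        f"site:{domain} inurl:login",         # login pages
    ]


def search_url(query):
    """Encode a query for use in a Google search URL."""
    return "https://www.google.com/search?q=" + quote_plus(query)


for dork in dorks_for("example.com"):
    print(dork, "->", search_url(dork))
```

Each query can also be typed directly into the search box; the URL form is only a convenience for bookmarking or scripting.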

TinEye

TinEye is a reverse image search engine that can track where an image came from, how it is being used, if modified versions of the image exist, or if there is a higher resolution version. This tool is especially useful in digital investigations, verifying the authenticity of images, and in the field of intellectual property and copyright, where identifying the use of an image across the internet is crucial.

Creepy

Creepy is an OSINT geolocation tool (geolocation being the process of determining the geographic position of a person or object). It allows users to gather geographical information from social media platforms, images, and other sources. By aggregating all the geolocation data associated with a user’s social media posts, Creepy can map out the physical movement patterns of individuals. That makes it a powerful tool in investigations and intelligence gathering, but it also raises significant privacy concerns.
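To see why aggregated geotags are so revealing, consider this rough sketch. It is not how Creepy is implemented; it assumes each post has already been reduced to a timestamp, latitude, and longitude (field names of my choosing), sorts the posts by time, and uses the haversine formula to measure the distance between consecutive locations. The coordinates are made up.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def movement_track(posts):
    """Order geotagged posts by time and measure each hop between consecutive posts."""
    ordered = sorted(posts, key=lambda p: p["time"])
    hops = []
    for prev, cur in zip(ordered, ordered[1:]):
        km = haversine_km(prev["lat"], prev["lon"], cur["lat"], cur["lon"])
        hops.append((prev["time"], cur["time"], round(km, 1)))
    return hops


# Made-up posts; ISO 8601 timestamps sort correctly as plain strings.
posts = [
    {"time": "2024-05-02T18:30", "lat": 40.7580, "lon": -73.9855},
    {"time": "2024-05-01T09:00", "lat": 40.7128, "lon": -74.0060},
]
for start, end, km in movement_track(posts):
    print(f"{start} -> {end}: {km} km")
```

With only a handful of timestamped locations, a routine (home in the morning, somewhere else in the evening) starts to emerge.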

BuiltWith

BuiltWith is a tool that can identify the technology stack used to build a website, including web servers, analytics tools, JavaScript libraries, and more. This information can be invaluable for competitive intelligence, sales intelligence, or cybersecurity. By understanding the technologies used on a website, one can infer potential website vulnerabilities, technology trends, or software preferences of target markets.
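I can’t speak to how BuiltWith works internally, but the general idea of technology fingerprinting can be sketched simply: many sites advertise their stack in HTTP response headers and HTML markup. The example below assumes you already have the headers and page source (the sample values are invented) and looks at the Server and X-Powered-By headers, the generator meta tag, and script file names.

```python
import re


def fingerprint(headers, html):
    """Collect technology hints from response headers and HTML markup."""
    hints = {}
    lowered = {k.lower(): v for k, v in headers.items()}
    for header in ("server", "x-powered-by"):
        if header in lowered:
            hints[header] = lowered[header]
    # Content management systems often announce themselves in a generator meta tag.
    generator = re.search(
        r'<meta[^>]+name=["\']generator["\'][^>]+content=["\']([^"\']+)["\']',
        html,
        re.IGNORECASE,
    )
    if generator:
        hints["generator"] = generator.group(1)
    # Script file names often reveal JavaScript libraries.
    scripts = re.findall(
        r'<script[^>]+src=["\'][^"\']*?([\w.-]+\.js)', html, re.IGNORECASE
    )
    hints["scripts"] = sorted(set(scripts))
    return hints


# Invented sample response for illustration.
headers = {"Server": "nginx/1.24.0", "X-Powered-By": "PHP/8.2.0"}
html = (
    '<meta name="generator" content="WordPress 6.4">'
    '<script src="/wp-includes/js/jquery/jquery.min.js"></script>'
)
print(fingerprint(headers, html))
```

Version strings like these are exactly what an attacker would compare against known vulnerabilities, which is why many administrators suppress them.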

SpiderFoot

SpiderFoot is an open-source tool used for automating the process of gathering intelligence about a website, IP address, or domain name. It can collect a wide range of data including network information, web server details, email addresses, and more. SpiderFoot aggregates this information from over 100 different sources and presents it in a coherent manner. It’s particularly useful for in-depth investigations, providing a comprehensive view of a target’s online presence, vulnerabilities, and potential security exposures.

Ethics and Legal Considerations in OSINT

While OSINT seeks to exploit publicly-available information, there are ethical and legal considerations. In the process of conducting your OSINT investigation, it’s crucial to respect privacy, adhere to data protection laws, and ensure that the intelligence gathering and analysis activities are compliant with applicable laws. Unauthorized access, even to publicly available information, could potentially lead to ethical quandaries at best and legal repercussions at worst. While OSINT provides powerful capabilities, it must be used responsibly and ethically.

Another important thing to consider is that if access to information requires signing up for an account (e.g., accessing a social media site), you may be limited by legal agreements as to what you can do with information available to you on those platforms. Even though data may seem open, it may not be. My rule of thumb is that if you need a password to access it, the intelligence information may no longer be, legally speaking, Open Source.

Protecting Yourself Online

In an age where Personally Identifiable Information (PII) is abundant online, protecting your private data from becoming part of someone else’s OSINT research is crucial. Consider the following:

  • Personal Data Management: Be conscious of the information you share online, especially when it comes to minor children or other sensitive areas of your life, such as location, absence(s) from the home, healthcare, travel plans, etc. Regularly review and manage your digital footprint.
  • Privacy Settings: Regularly review and adjust privacy settings on social media and other online platforms to control who can view your information. Restricting social media posts to select, trusted family members, friends, and acquaintances could minimize exploitation of personal information you would prefer to keep private.
  • Awareness of Digital Trails: Be aware of the trails you leave on the internet. This includes being cautious about the information you post, the sites you visit, and the networks you use. Have you ever noticed that, right after searching for something, ads for that very thing seem to magically show up on your social media feeds? Cookies abound.

Getting Started

For those interested in exploring OSINT, many educational resources—both free and paid—are available. Online courses, webinars, forums, and communities offer a wealth of knowledge and support. If you’re just starting out, begin with foundational research and intelligence analysis concepts and gradually delve into specialized OSINT resources. Practical experience, combined with continuous learning, is key to effectively leveraging OSINT.

For starters, I recommend visiting the aforementioned OSINT Framework and poking around until you find something interesting. Tools like Maltego can be intimidating for beginners, but there are many YouTube videos available to help teach you the ropes.

OSINT is a fascinating, dynamic field, integral to many sectors including national security, law enforcement, cybersecurity, and research. As the real world continues to expand into the online world, the relevance and importance of OSINT grow with it. That includes developing an awareness of our own digital footprints by monitoring what data about us exists online (willingly shared or not). I encourage you to delve deeper into this field, armed with the knowledge and tools to use it responsibly and protect your own data.