Question regarding osquery queries integration in Velociraptor


Carlos Lopez

Jan 29, 2021, 5:05:36 AM
to velocirapt...@googlegroups.com

Good morning,

 

As I mentioned in a previous thread, I am in the process of integrating all osquery queries into Velociraptor. As a first step I want to bring them into Velociraptor as-is, using the artifact developed for that purpose.

 

My question is how to create a single artifact containing all the queries from some osquery packs: should I register them one by one, or can I group them inside Velociraptor as if they were an osquery pack?

 

Thank you very much for your help.

Regards,

C. L. Martinez

 

Mike Cohen

Jan 29, 2021, 8:23:17 AM
to Carlos Lopez, velocirapt...@googlegroups.com
Hi Carlos,

The main challenge in processing the results of multiple queries is the shape of each query (by shape I mean the columns and the type of each column). In particular, if we need to feed the results into another tool, these need to be consistent, i.e. each query needs to have a single shape.

You can of course just feed all the queries into the same artifact, but this will be hard to use because there will not be a consistent shape: you can imagine the columns changing halfway through the query. In fact, the Velociraptor GUI relies on a consistent shape to draw the table in the UI, and it might get confused or omit columns if the shape changes from row to row.
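To make the "shape" problem concrete, here is a small, hypothetical Python illustration (the data is made up, not real osquery output): rows from two different queries have different column sets, so flattening them into one table forces a union of columns padded with empty cells, which is exactly what a table renderer has to cope with.

```python
# Hypothetical rows from two osquery queries with different "shapes"
# (i.e. different column sets).
rows_query1 = [{"pid": 1, "name": "init"}, {"pid": 42, "name": "sshd"}]
rows_query2 = [{"path": "/etc/hosts", "mtime": 1611900000}]

# Flattening both into one table forces a union of all columns...
merged = rows_query1 + rows_query2
columns = sorted({col for row in merged for col in row})

# ...and pads the missing cells with None.
table = [[row.get(col) for col in columns] for row in merged]
```

Half the cells in the resulting table are empty, and any downstream tool has to guess which columns belong to which query.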

If you don't care about the GUI or post-processing, you can easily just do something like this:

```
name: Custom.Query.Pack
parameters:
- name: Queries
  type: csv
  default: |
     Query
     SELECT * FROM ...
     SELECT * FROM ....
     SELECT * FROM ...

sources:
- query: |
    SELECT * FROM foreach(row=Queries,
     query={ SELECT * FROM Artifact.Generic.Windows.OSQuery(query=Query) })
```

But if you want to keep the tables separate, you can put them in their own sources:
```
name: Custom.Query.Pack
sources:
- name: FirstQuery
  query: |
       LET Query = "SELECT * FROM ...."
       SELECT * FROM Artifact.Generic.Windows.OSQuery(query=Query)
- name: SecondQuery
  query: |
       LET Query = "SELECT * FROM ...."
       SELECT * FROM Artifact.Generic.Windows.OSQuery(query=Query)
- name: ThirdQuery
  query: |
       LET Query = "SELECT * FROM ...."
       SELECT * FROM Artifact.Generic.Windows.OSQuery(query=Query)
- name: FourthQuery
  query: |
       LET Query = "SELECT * FROM ...."
       SELECT * FROM Artifact.Generic.Windows.OSQuery(query=Query)
```

Although the second form is more verbose, it is better because it keeps the tables visibly separate, and you will also be able to target each source separately in the notebook for post-processing.

Thanks
Mike

Mike Cohen 
Digital Paleontologist, 
Velocidex Enterprises
M +61 470 238 491
mi...@velocidex.com 



Mike Cohen

Jan 29, 2021, 8:25:23 AM
to Carlos Lopez, velocirapt...@googlegroups.com
If you already have osquery query packs, it should be trivial to write some Python to generate the multi-source artifact for each pack. Let me know if you need a hand writing such a script...

Thanks
Mike


Carlos Lopez

Jan 29, 2021, 11:26:44 AM
to Mike Cohen, velocirapt...@googlegroups.com

Many thanks Mike for your help. As an example:

 

```
"process_events": {
  "query": "SELECT auid, cmdline, ctime, cwd, egid, euid, gid, parent, path, pid, time, uid FROM process_events WHERE path NOT IN ('/bin/sed', '/usr/bin/tr', '/bin/gawk', '/bin/date', '/bin/mktemp', '/usr/bin/dirname', '/usr/bin/head', '/usr/bin/jq', '/bin/cut', '/bin/uname', '/bin/basename') and cmdline NOT LIKE '%_key%' AND cmdline NOT LIKE '%secret%';",
  "interval": 10,
  "description": "Process events collected from the audit framework"
},
"socket_events": {
  "query": "SELECT action, auid, family, local_address, local_port, path, pid, remote_address, remote_port, success, time FROM socket_events WHERE success=1 AND path NOT IN ('/usr/bin/hostname') AND remote_address NOT IN ('127.0.0.1', '169.254.169.254', '', '0000:0000:0000:0000:0000:0000:0000:0001', '::1', '0000:0000:0000:0000:0000:ffff:7f00:0001', 'unknown', '0.0.0.0', '0000:0000:0000:0000:0000:0000:0000:0000');",
  "interval": 10,
  "description": "Socket events collected from the audit framework"
},
"file_events": {
  "query": "SELECT * FROM file_events;",
  "interval": 10,
  "description": "File events collected from file integrity monitoring",
  "removed": false
},
"apt_sources": {
  "query": "SELECT * FROM apt_sources;",
  "interval": 86400,
  "description": "Display apt package manager sources.",
  "snapshot": true,
  "platform": "ubuntu"
},
"authorized_keys": {
  "query": "SELECT * FROM users CROSS JOIN authorized_keys USING (uid);",
  "interval": 86400,
  "description": "A line-delimited authorized_keys table."
},
"behavioral_reverse_shell": {
  "query": "SELECT DISTINCT(processes.pid), processes.parent, processes.name, processes.path, processes.cmdline, processes.cwd, processes.root, processes.uid, processes.gid, processes.start_time, process_open_sockets.remote_address, process_open_sockets.remote_port, (SELECT cmdline FROM processes AS parent_cmdline WHERE pid=processes.parent) AS parent_cmdline FROM processes JOIN process_open_sockets USING (pid) LEFT OUTER JOIN process_open_files ON processes.pid = process_open_files.pid WHERE (name='sh' OR name='bash') AND remote_address NOT IN ('0.0.0.0', '::', '') AND remote_address NOT LIKE '10.%' AND remote_address NOT LIKE '192.168.%';",
  "interval": 600,
  "description": "Find shell processes that have open sockets"
},
"cpu_time": {
  "query": "SELECT * FROM cpu_time;",
  "interval": 3600,
  "description": "Displays information from /proc/stat file about the time the CPU cores spent in different parts of the system"
},
"crontab": {
  "query": "SELECT * FROM crontab;",
  "interval": 3600,
  "description": "Retrieves all the jobs scheduled in crontab in the target system."
},
"crontab_snapshot": {
  "query": "SELECT * FROM crontab;",
  "interval": 86400,
  "description": "Retrieves all the jobs scheduled in crontab in the target system.",
  "snapshot": true
}
```

 

Given what you have explained, what would be the best approach here?

 


Wes Lambert

Jan 29, 2021, 2:02:00 PM
to Carlos Lopez, Mike Cohen, velocirapt...@googlegroups.com
Hi Carlos,

You could try something like the following (a crude transformation from your output, with the opening/closing curly braces added; the script could obviously be optimized for input handling), but it won't account for the interval. @Mike Cohen would know better how to approach that. Without specifying it within the artifact for each query, I believe you would need to schedule an event/queries if you want them to run on a regular basis. Another option might be to modify the main OSQuery artifact to specify an interval, or to manage the packs/config using Velociraptor.

#!/usr/bin/python3
import json
import yaml

f = open('pack.json', 'r')
jsonData = json.load(f)
f.close()
ff = open('Pack.yaml', 'w+')

os_name="Linux"
artifact_name = "Custom.Query.Pack\n"
ff.write("name: " + artifact_name)
ff.write("sources:")
for i in jsonData:
    query_name = i
    query_description = jsonData[i]['description']
    query = jsonData[i]['query']
    query_block = """
  - name: %s
    description: %s
    query: |
      LET Query = "%s"
      SELECT * FROM Artifact.Windows.OSQuery.Generic(query=Query)""" % (query_name, query_description, query, os_name)
    ff.write(query_block)
ff.close()

...which would give you something like:

name: Custom.Query.Pack
sources:
  - name: process_events
    description: Process events collected from the audit framework
    query: |
      LET Query = "SELECT auid, cmdline, ctime, cwd, egid, euid, gid, parent, path, pid, time, uid FROM process_events WHERE path NOT IN ('/bin/sed', '/usr/bin/tr', '/bin/gawk', '/bin/date', '/bin/mktemp', '/usr/bin/dirname', '/usr/bin/head', '/usr/bin/jq', '/bin/cut', '/bin/uname', '/bin/basename') and cmdline NOT LIKE '%_key%' AND cmdline NOT LIKE '%secret%';"
      SELECT * FROM Artifact.Generic.Windows.OSQuery(query=Query)
  - name: socket_events
    description: Socket events collected from the audit framework
    query: |
      LET Query = "SELECT action, auid, family, local_address, local_port, path, pid, remote_address, remote_port, success, time FROM socket_events WHERE success=1 AND path NOT IN ('/usr/bin/hostname') AND remote_address NOT IN ('127.0.0.1', '169.254.169.254', '', '0000:0000:0000:0000:0000:0000:0000:0001', '::1', '0000:0000:0000:0000:0000:ffff:7f00:0001', 'unknown', '0.0.0.0', '0000:0000:0000:0000:0000:0000:0000:0000');"
      SELECT * FROM Artifact.Generic.Windows.OSQuery(query=Query)
...

Or maybe you could try something like the following:

#!/usr/bin/python3
import json
import yaml

f = open('pack.json', 'r')
jsonData = json.load(f)
f.close()
ff = open('Pack.yaml', 'w+')

os_name="Linux"
artifact_name = "Custom.Query.Pack\n"
ff.write("name: " + artifact_name)
ff.write("type: CLIENT_EVENT\n")
ff.write("sources:")
for i in jsonData:
    query_name = i
    query_description = jsonData[i]['description']
    query_interval = jsonData[i]['interval']
    query = jsonData[i]['query']
    query_block = """
  - name: %s
    description: %s
    queries:
      - LET Query = "%s"
      - SELECT * from foreach(
          row={
            SELECT UnixNano FROM clock(period=atoi(string=%s))
          },
          query={
            SELECT * FROM Artifact.%s.OSQuery.Generic(query=Query)
          }
        )""" % (query_name, query_description, query, query_interval, os_name)
    ff.write(query_block)
ff.close()

...which could give you a client monitoring event with the queries scheduled:

name: Custom.Query.Pack
type: CLIENT_EVENT
sources:
  - name: process_events
    description: Process events collected from the audit framework
    query: |
      LET Query = "SELECT auid, cmdline, ctime, cwd, egid, euid, gid, parent, path, pid, time, uid FROM process_events WHERE path NOT IN ('/bin/sed', '/usr/bin/tr', '/bin/gawk', '/bin/date', '/bin/mktemp', '/usr/bin/dirname', '/usr/bin/head', '/usr/bin/jq', '/bin/cut', '/bin/uname', '/bin/basename') and cmdline NOT LIKE '%_key%' AND cmdline NOT LIKE '%secret%';"
      SELECT * from foreach(
          row={
            SELECT UnixNano FROM clock(period=atoi(string=10))
          },
          query={
            SELECT * FROM Artifact.Linux.OSQuery.Generic(query=Query)
          }
        )
  - name: socket_events
    description: Socket events collected from the audit framework
    query: |
      LET Query = "SELECT action, auid, family, local_address, local_port, path, pid, remote_address, remote_port, success, time FROM socket_events WHERE success=1 AND path NOT IN ('/usr/bin/hostname') AND remote_address NOT IN ('127.0.0.1', '169.254.169.254', '', '0000:0000:0000:0000:0000:0000:0000:0001', '::1', '0000:0000:0000:0000:0000:ffff:7f00:0001', 'unknown', '0.0.0.0', '0000:0000:0000:0000:0000:0000:0000:0000');"
      SELECT * from foreach(
          row={
            SELECT UnixNano FROM clock(period=atoi(string=10))
          },
          query={
            SELECT * FROM Artifact.Linux.OSQuery.Generic(query=Query)
          }
        )

I haven't tested these, and they may not work at all, but I hope this helps point you in the right direction.




Wes Lambert

Jan 29, 2021, 2:08:31 PM
to Carlos Lopez, Mike Cohen, velocirapt...@googlegroups.com
Apologies, the first one should be:

#!/usr/bin/python3
import json
import yaml

f = open('pack.json', 'r')
jsonData = json.load(f)
f.close()
ff = open('Pack.yaml', 'w+')

os_name="Linux"
artifact_name = "Custom.Query.Pack\n"
ff.write("name: " + artifact_name)
ff.write("sources:")
for i in jsonData:
    query_name = i
    query_description = jsonData[i]['description']
    query = jsonData[i]['query']
    query_block = """
  - name: %s
    description: %s
    query: |
      LET Query = "%s"
      SELECT * FROM Artifact.%s.OSQuery.Generic(query=Query)""" % (query_name, query_description, query, os_name)
    ff.write(query_block)
ff.close()
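One caveat with generating the artifact via `%s` interpolation: a query containing an unescaped double quote or other YAML-significant characters can produce an invalid file. A quick, untested sketch of a sanity check (the `validate_artifact` helper is hypothetical, not part of Velociraptor) is to round-trip the output through `yaml.safe_load` before importing it:

```python
# Sketch: sanity-check the generated artifact file before importing it
# into Velociraptor. yaml.safe_load raises if the %s interpolation
# produced invalid YAML (e.g. a query with an unescaped double quote).
import yaml

def validate_artifact(path="Pack.yaml"):
    with open(path) as f:
        artifact = yaml.safe_load(f)
    # Minimal structural checks for a multi-source artifact.
    assert isinstance(artifact.get("name"), str), "artifact needs a name"
    for source in artifact.get("sources", []):
        assert "query" in source or "queries" in source, \
            "each source needs a query"
    return artifact
```

Running this right after the generator script gives an early failure instead of a confusing import error in the GUI.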

Mike Cohen

Jan 29, 2021, 5:37:36 PM
to Wes Lambert, Carlos Lopez, velociraptor-discuss
Not an osquery expert :-) What does "interval" mean?

Do you mean to run these queries repeatedly and report differences?

Wes Lambert

Jan 30, 2021, 7:39:27 AM
to Mike Cohen, Carlos Lopez, velociraptor-discuss
Essentially, yes -- based on the defined interval (n seconds) provided to osquery, it will present results in a differential fashion unless you specify a snapshot.
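In other words (a hedged sketch of the semantics, not osquery's actual implementation): on each interval the new result set is compared against the previous one, and only added/removed rows are reported, whereas a "snapshot" query reports the full result set every time.

```python
# Sketch of differential reporting: compare two runs of the same query
# and report only the rows that were added or removed between them.
def diff_results(previous, current):
    prev = {tuple(sorted(r.items())) for r in previous}
    curr = {tuple(sorted(r.items())) for r in current}
    added = [dict(r) for r in curr - prev]
    removed = [dict(r) for r in prev - curr]
    return added, removed

# Hypothetical results from two consecutive intervals.
run1 = [{"pid": 1, "name": "init"}]
run2 = [{"pid": 1, "name": "init"}, {"pid": 99, "name": "nc"}]
added, removed = diff_results(run1, run2)
```

Here only the new `nc` process would be reported on the second interval; the unchanged `init` row is suppressed.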

Mike Cohen

Jan 30, 2021, 8:34:44 AM
to Wes Lambert, Carlos Lopez, velociraptor-discuss
So what is the process_events table? Is it a proper event-driven source?

Looking at the source, it appears that osquery in single-shot mode does not work with "event" tables. Certainly, running this query does not work:
$ osquery-4.4.0-linux-amd64 "SELECT * FROM process_events"
W0130 13:32:48.475754 2751042 virtual_table.cpp:967] Table process_events is event-based but events are disabled
W0130 13:32:48.475929 2751042 virtual_table.cpp:974] Please see the table documentation: https://osquery.io/schema/#process_events

Looking at the code, I think it uses auditd to read these events, so it needs to be constantly running. You could easily convert those queries to VQL (using the auditd() plugin), but this is probably not compatible with shelling out to osquery.

Thanks
Mike



Carlos Lopez

Jan 30, 2021, 10:49:18 AM
to Mike Cohen, Wes Lambert, velociraptor-discuss

Many thanks Wes and Mike. I will try both approaches explained by Wes, and I will see how it goes …

 

I will keep you informed.

 

Regards.

 

