Monitoring and Tracing High Throughput Systems
Introducing Phobos into any Akka.NET application inevitably has some performance impact, because the tracing and monitoring instrumentation runs inside your actors.
If you're considering using Phobos inside a very high-throughput system (e.g. one handling millions of in-memory messages per node per second), this article covers practices that help minimize the amount of noise generated by Phobos, as well as configuration options that preserve as much performance as possible.
Best Practice 1: Use Sampling
The easiest option for reducing Phobos' strain on any high throughput system is to use the sampling features made available in both Phobos.Monitoring and Phobos.Tracing.
The sampling strategies used inside Phobos are probabilistic, meaning that if you have the following configuration:
phobos{
  tracing.sample-rate = 0.1
  monitoring.sample-rate = 0.1
}
Then you have a 10% chance of any individual metric being captured and recorded and a 10% chance of a new trace being started inside Phobos.
However, sampling works slightly differently in Phobos.Monitoring vs. Phobos.Tracing:
- In Phobos.Monitoring, all metric events (i.e. recording a single value for a counter, gauge, or timing) are independent from each other. Therefore the 10% sample rate applies evenly to all metrics.
- In Phobos.Tracing, trace events are dependent on each other: once a trace has begun, all subsequent operations will be included inside the recorded trace regardless of whether sampling is set to 10% or 100%. The only thing the sample rate affects is the likelihood of a new trace starting.
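The difference between the two sampling models can be illustrated with a small simulation. This is only a sketch of the behavior described above, not Phobos code: the function names CountSampledMetrics and CountSampledSpans, and the fixed number of operations per trace, are assumptions made for illustration.

```csharp
using System;

// Hypothetical simulation; none of these names are part of the Phobos API.

// Monitoring-style sampling: every metric event gets its own independent roll.
static int CountSampledMetrics(Random random, int events, double sampleRate)
{
    var recorded = 0;
    for (var i = 0; i < events; i++)
    {
        if (random.NextDouble() < sampleRate) recorded++;
    }
    return recorded;
}

// Tracing-style sampling: one roll decides whether a trace starts. Once it
// has started, every operation that belongs to it is recorded.
static int CountSampledSpans(Random random, int traces, int spansPerTrace, double sampleRate)
{
    var recorded = 0;
    for (var i = 0; i < traces; i++)
    {
        if (random.NextDouble() < sampleRate) recorded += spansPerTrace;
    }
    return recorded;
}

var seeded = new Random(42);

// Roughly 10% of the individual metric events are recorded.
Console.WriteLine(CountSampledMetrics(seeded, 100_000, 0.1));

// Roughly 10% of the traces start, and each started trace keeps all 5 of its
// operations, so the result is always a multiple of 5.
Console.WriteLine(CountSampledSpans(seeded, 10_000, 5, 0.1));
```

In the second case the sample rate controls how many traces exist, not how complete each one is, which is why a sampled trace never has gaps in it.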
This brings us to the next best practice for using Phobos and distributed tracing inside high throughput Akka.NET systems.
Best Practice 2: Use Whitelist Filtering for Tracing
We have in-depth documentation that explains the functionality and role of the message filtering settings inside Phobos.Tracing, but in a high performance context whitelist filtering is going to give you the most bang for your buck.
var config = @"
    akka.actor.provider = ""Phobos.Actor.PhobosActorRefProvider,Phobos.Actor""
    phobos.tracing{
        provider-type = test
        filter{
            mode = whitelist
            message-types = [
                ""Phobos.Docs.Samples.Tests.Tracing.IFilteredMessage, Phobos.Docs.Samples.Tests""
            ]
        }
    }
";
sys = ActorSystem.Create("PhobosTest", config);
var tracer = (MockTracer) ActorTracing.For(sys).Tracer;
var actor = sys.ActorOf(Props.Create(() => new EchoActor()), "echo");
// send a message that will be filtered out by whitelist
actor.Ask<NormalMessage>(new NormalMessage("hi"), TimeSpan.FromMilliseconds(100)).Wait();
tracer.FinishedSpans().Count.Should().Be(0); // no trace recorded
// send a message that won't be filtered out by whitelist
actor.Ask<FilteredMessage>(new FilteredMessage("bye"), TimeSpan.FromMilliseconds(100)).Wait();
AwaitAssert(() => tracer.FinishedSpans().Count.Should().Be(1));
In the example here, traces can only begin when a message of type Phobos.Docs.Samples.Tests.Tracing.IFilteredMessage is processed by an actor. No other message sent within your ActorSystem can begin a trace, so the trace data generated by Phobos is limited to the message flows that begin with an IFilteredMessage being processed.
This significantly reduces the amount of trace data produced inside your ActorSystem, and it also avoids the lion's share of the CPU penalty introduced by Phobos.Tracing's instrumentation for messages that are filtered out.
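In the sample above, a FilteredMessage passes a whitelist that lists only IFilteredMessage, which suggests that whitelist entries match by type compatibility rather than by exact type name. The sketch below illustrates that idea using built-in .NET types. CanStartTrace is a hypothetical helper, not a Phobos API, and the matching rule it implements is an assumption inferred from the sample.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the Phobos API: returns true when the
// message's runtime type is, derives from, or implements a whitelisted type.
static bool CanStartTrace(object message, IReadOnlyCollection<Type> allowedTypes)
{
    return allowedTypes.Any(allowed => allowed.IsInstanceOfType(message));
}

// Stand-in whitelist built from BCL types so the sketch is self-contained.
var whitelist = new[] { typeof(int), typeof(IFormattable) };

Console.WriteLine(CanStartTrace(42, whitelist));      // True: exact match on System.Int32
Console.WriteLine(CanStartTrace(3.14, whitelist));    // True: System.Double implements IFormattable
Console.WriteLine(CanStartTrace("hello", whitelist)); // False: System.String matches neither entry
```

Under this assumption, whitelisting a single marker interface is enough to opt an entire family of message classes into tracing.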
Best Practice 3: Enable Monitoring and Tracing Only Where It's Needed
In addition to, or instead of, sampling and filtering, the other robust way to manage the impact of Phobos inside high-throughput Akka.NET applications is to turn off tracing and monitoring where they are less valuable and leave them on only where they are needed.
The easiest way to do this is to use a Phobos configuration that looks like the following:
akka.actor{
  provider = "Phobos.Actor.PhobosActorRefProvider, Phobos.Actor"
  deployment{
    # use custom sample rates and filtering for high-throughput actors
    /coordinator{
      phobos{
        # ensure that all descendants of /user/coordinator have these settings
        propagate-settings-to-children = on
        tracing{
          sample-rate = 0.1 # 10% sample rate
          filter{
            mode = whitelist
            message-types = [
              "Phobos.Docs.Samples.Tests.Tracing.IFilteredMessage, Phobos.Docs.Samples.Tests",
              "System.Int32"
            ]
          }
        }
        monitoring{
          sample-rate = 0.1 # 10% sample rate
          monitor-mailbox-depth = off # don't monitor mailboxes
        }
      }
    }
  }
}
phobos{
  tracing.enabled = off
  monitoring.enabled = off
}
Under these settings, all actors except those deployed under the /user/coordinator path won't trace or monitor anything by default. However, the /user/coordinator actor and all of its children, grandchildren, and other descendant actors will record tracing and monitoring events at a 10% sample rate, with the whitelist filter applied for tracing. This is because the phobos.propagate-settings-to-children option is set to on, and this value is applied recursively down the hierarchy.
As you can see in our benchmarks, the performance impact of Phobos on actors when tracing and monitoring aren't enabled is within the margin of error. Therefore, with this configuration most of the actors will continue to process messages at their usual throughput and memory consumption even with Phobos installed.