v6.6 CE Release Notes
Created:2025-08-19 Last Modified:2025-08-19
This document was translated by ChatGPT
#1. Backport From 7.0
- AutoTracing
- [2025/01/02] Support collection and tracing of the Tars protocol, documentation.
- [2025/01/16] For non-TCP traffic in network flow logs (l4_flow_log), change the end status (close_type) from timeout to normal end (1).
- [2025/04/02] Support collection and tracing of the Ping protocol, documentation.
- [2025/04/02] Support collection and tracing of the Dubbo protocol when using Fastjson serialization, documentation.
- [2025/04/15] Support parsing MySQL Login Response statements.
- [2025/04/15] Support parsing multiple DNS requests in a TCP Payload.
- [2025/04/28] Enrich eBPF hook points for collecting file read/write events (io_event) to improve adaptability.
- [2025/05/29] Support collecting Unix Socket call logs (l7_flow_log) and automatic tracing between TCP/UDP Socket call logs and Unix Socket call logs.
- [2025/05/29] Support parsing SRV type DNS call logs, documentation.
- [2025/05/29] Support parsing truncated MySQL protocol content.
- AutoTagging
- [2025/04/28] Optimize the meaning of the process_kname field in call logs and file read/write event data, changing it from the kernel thread name to the system process name for better readability.
- [2025/04/28] Aggregate processes with the same cmdline within the same cloud host or the same K8s workload into a unique gprocess to reduce redundant process information.
- [2025/04/28] Optimize default values for the process matcher, documentation.
  - By default, ignore collection of process information for sleep/sh/bash/pause/runc.
  - By default, collect process information and OnCPU profiling data for Java/Python, and automatically record the gprocess name as the jar/py file name to avoid all being displayed as java/python.
  - By default, collect process information and OnCPU profiling data for deepflow-*.
  - By default, collect process information in containers.
- [2025/04/28] Optimize the meaning of the response_status field in call logs (l7_flow_log).
  - Normal: Response code is normal.
  - Client Error: Response code indicates a client-side error, e.g., HTTP 4XX.
  - Server Error: Response code indicates a server-side error, e.g., HTTP 5XX.
  - Timeout: If no response is collected within a certain time, the request is marked as timed out.
    - Agent Application session merge timeout configuration: DNS and TLS default to 15s, other protocols default to 120s, documentation.
  - Unknown: When concurrent requests exceed the collector's cache capacity, the oldest requests are marked as unknown.
    - Agent Maximum session aggregation entries configuration: Default cache of 64K requests, documentation.
  - Parse Failed: Response was collected but the response code could not be parsed due to truncation or compression.
    - Agent Payload truncation configuration: By default, the first 1024 bytes of the Payload are parsed, documentation.
- [2025/06/11] Optimize parsing of unary type gRPC calls, documentation.
- [2025/08/21] Support collecting multiple HTTP2/gRPC requests and responses in a single packet.
- [2025/08/21] Support obtaining the full file path for file read/write events:
  - Fully obtain NAS file paths, supporting NFS, SMB, CIFS, and other protocols.
  - Fully obtain the absolute path for file read/write inside container Pods.
- AutoMetrics
- [2025/08/21] Support aggregation to generate eBPF profiling metric data with 1s granularity to speed up profiling metric queries.
- AutoTagging
- [2025/08/21] Simplify process sync blacklist configuration, documentation.
- [2025/08/21] Adapt to K8s v1.32+ API.
- Server
- [2025/02/11] Support terminating remote upgrades of collectors and optimize CPU resource usage of the Server during upgrades.
- Agent
- [2025/02/11] Support limiting the number of sockets used by deepflow-agent, documentation.
- [2025/03/18] Support collecting traffic from Pod internal NICs, applicable to scenarios where Pod NIC traffic cannot be directly collected under the Root network namespace (e.g., Huawei Cloud CCE Turbo CNI), documentation.
- [2025/04/15] Limit the bandwidth consumption of data sent by the agent, default allowing 100Mbps, documentation.
- [2025/04/28] Optimize memory usage of the cache for application performance metrics in the Agent by timely cleaning up expired LRU entries, reducing overall memory consumption by 43% in test environments.
- [2025/04/28] Aggregate and store flow logs (l4_flow_log) generated by LB health checks, reducing flow log storage overhead by nearly 50% in some production environments, documentation.
- [2025/04/28] Optimize the resource overhead protection mechanism that applies when application protocol recognition fails, to avoid mistakenly disabling application protocol parsing, documentation.
- [2025/05/16] Support compressed transmission of call logs and flow logs, with a compression ratio of up to 8:1 in test environments, documentation.
- [2025/05/29] When Agent traffic reaches the rate limit, support choosing between drop or wait strategies; the default is drop, and wait can be configured to improve data transmission success rate, documentation.
- [2025/06/11] Add a circuit breaker mechanism for free disk space in the Agent runtime environment, documentation.
- [2025/06/11] Support disabling Agent's use of swap memory, documentation.
- [2025/06/11] Adapt to K8s CNI with identical MAC addresses for virtual NICs on the same host.
- [2025/06/11] Optimization: Reduce work done by the Agent when disabled.
- [2025/08/21] Support Watchdog mechanism to ensure circuit breakers execute properly in extreme cases.
- [2025/08/21] Support compressed transmission of application logs, with compression ratios between 5:1 and 20:1, documentation.
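
The response_status semantics listed under AutoTagging above can be read as a small decision procedure. The Python sketch below is an illustration only, not DeepFlow agent code: the names ResponseStatus, classify_response, and session_timeout_s are hypothetical, and HTTP status ranges are used only because the notes cite HTTP 4XX and 5XX as examples.

```python
from enum import Enum
from typing import Optional


class ResponseStatus(Enum):
    NORMAL = "Normal"
    CLIENT_ERROR = "Client Error"
    SERVER_ERROR = "Server Error"
    TIMEOUT = "Timeout"
    UNKNOWN = "Unknown"
    PARSE_FAILED = "Parse Failed"


# Default session merge timeouts quoted in the notes, in seconds.
DNS_TLS_TIMEOUT_S = 15
OTHER_TIMEOUT_S = 120


def session_timeout_s(protocol: str) -> int:
    """Return the default merge timeout for a protocol name (hypothetical helper)."""
    return DNS_TLS_TIMEOUT_S if protocol.upper() in ("DNS", "TLS") else OTHER_TIMEOUT_S


def classify_response(
    status_code: Optional[int],
    response_seen: bool = True,
    evicted: bool = False,
) -> ResponseStatus:
    """Classify one finished HTTP-like session.

    evicted:       the request was pushed out of the session cache (assumed flag).
    response_seen: a response arrived before the merge timeout expired.
    status_code:   the parsed code, or None if the response could not be parsed.
    """
    if evicted:
        return ResponseStatus.UNKNOWN
    if not response_seen:
        return ResponseStatus.TIMEOUT
    if status_code is None:
        return ResponseStatus.PARSE_FAILED
    if 400 <= status_code <= 499:
        return ResponseStatus.CLIENT_ERROR
    if 500 <= status_code <= 599:
        return ResponseStatus.SERVER_ERROR
    return ResponseStatus.NORMAL
```

The order of the checks is a design assumption: cache eviction and timeout are tested first because in both cases no usable response code exists.
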
#2. v6.6.9 [2024/12/12]
#2.1 Stable Feature
- AutoTracing
- Support collection and tracing of the Memcached protocol, documentation.
- cBPF data supports Tars protocol parsing, documentation.
- File read/write events support collecting the full path of file names and the offset of read/write files.
- AutoProfiling
- Support CPU performance profiling for Python and CUDA.
- Optimize Java process symbol table synchronization mechanism, reducing transient CPU consumption introduced to business processes by about 50%.
- Improve function stack merging efficiency, reducing resource overhead for function stack reporting, with significant performance improvement in scenarios with many threads of the same name.
- AutoTagging
- When TraceID exists in the protocol header, support disabling eBPF syscall_trace_id calculation (via syscall_trace_id_disabled) to reduce impact on business performance.
- Support completely disabling cBPF data collection (by setting tap_interface_regex to an empty string) to reduce memory overhead.
- Enhance process synchronization capability, documentation.
  - Support synchronizing only processes inside containers.
  - Support not synchronizing Socket information (only process information).
- When a region whitelist is configured for the cloud platform (Domain), calling the Region API is no longer required.
- Failure to obtain NAT gateway, routing table, or load balancer information from Alibaba Cloud or Tencent Cloud will not affect synchronization of other resource information.
- Server
- Optimize storage performance of genesis* related MySQL tables.
- Support using ByConity instead of ClickHouse, documentation.
- Support using ClickHouse Enterprise Edition (currently only supported on Alibaba Cloud), documentation.
- Agent
- Support compressed transmission of profiling data, reducing bandwidth consumption by 30%.
- Application log data supports compressed transmission, reducing bandwidth consumption by 95% (CPU consumption increases by 3%).
- Support deepflow-agent using a single socket to transmit all observability data, and allow disabling this feature via multiple_sockets_to_ingester to use multiple sockets for improved transmission performance.
- When BTF (BPF Type Format) is enabled on Linux, and the kernel is >= 5.5 on X86 architecture or >= 6.0 on ARM architecture, the agent will automatically use fentry/fexit instead of kprobe/kretprobe, resulting in about 15% performance improvement.
- The original environment variable ONLY_WATCH_K8S_RESOURCE has been replaced with K8S_WATCH_POLICY, documentation.
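
These notes quote compression both as percentage reductions (30% and 95% in this section) and as ratios (8:1, and 5:1 to 20:1, in the backport list). The Python sketch below converts between the two, assuming that "reducing bandwidth consumption by X%" means the compressed size is (100 - X)% of the original; the helper names are hypothetical.

```python
def reduction_from_ratio(ratio: float) -> float:
    """Fraction of bandwidth saved by an N:1 compression ratio."""
    if ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return 1.0 - 1.0 / ratio


def ratio_from_reduction(reduction: float) -> float:
    """Compression ratio N (as in N:1) implied by a fractional reduction."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


if __name__ == "__main__":
    for ratio in (5, 8, 20):
        print(f"{ratio}:1 saves {reduction_from_ratio(ratio):.1%}")
    for reduction in (0.30, 0.95):
        print(f"{reduction:.0%} saved is about {ratio_from_reduction(reduction):.2f}:1")
```

Under that assumption, a 95% reduction corresponds to a 20:1 ratio, and an 8:1 ratio corresponds to an 87.5% reduction.
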
#3. v6.6.8 [2024/11/14]
#3.1 Stable Feature
- Server
- By default, aggregate and generate network performance metrics and application performance metrics with granularity of 1h and 1d.
- Agent
- Configuration refactoring, documentation.
#4. v6.6.7 [2024/10/31]
#4.1 Beta Feature
- AutoTagging
- Enhance process synchronization capability, documentation.
  - Support synchronizing only processes inside containers.
  - Support not synchronizing Socket information (only process information).
#5. v6.6.6 [2024/10/11]
#5.1 Backward Incompatible Change
- AutoTracing
- To reduce resource overhead and avoid misidentification, the agent will by default only parse the following application protocols (to enable parsing of other protocols, configure l7-protocol-enabled):
  - HTTP, HTTP2/gRPC, MySQL, Redis, Kafka, DNS, TLS.
  - Reminder: When using Wasm to parse private protocols, please add Custom to l7-protocol-enabled.
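
The effect of this change can be pictured as an allow-list check. The Python sketch below is a hypothetical model, not the agent's implementation: the protocol names are copied verbatim from the list above and may be spelled differently in the real l7-protocol-enabled setting, and enabled_protocols and should_parse are illustrative helpers.

```python
from typing import Iterable, List

# Names copied from the release note; the actual configuration values
# accepted by l7-protocol-enabled may differ.
DEFAULT_ENABLED: List[str] = [
    "HTTP", "HTTP2/gRPC", "MySQL", "Redis", "Kafka", "DNS", "TLS",
]


def enabled_protocols(extra: Iterable[str] = ()) -> List[str]:
    """Return the default allow-list extended with extra names, without duplicates."""
    result = list(DEFAULT_ENABLED)
    seen = {name.lower() for name in result}
    for name in extra:
        if name.lower() not in seen:
            result.append(name)
            seen.add(name.lower())
    return result


def should_parse(protocol: str, enabled: Iterable[str]) -> bool:
    """Case-insensitive membership test against the allow-list."""
    return protocol.lower() in {name.lower() for name in enabled}
```

In this model, a protocol such as Dubbo is skipped until it is added explicitly, and Wasm-parsed private protocols are skipped until Custom is added.
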
#5.2 Stable Feature
- Agent
- Support specifying and disabling K8s List & Watch via environment variables (thanks to Hyzhou: FR, FR).
- Reduce eBPF memory overhead of the Agent (thanks to qyzhaoxun: FR).
#6. v6.6.5 [2024/09/24]
#6.1 Beta Feature
- AutoProfiling
- Optimize Java process symbol table synchronization mechanism, reducing transient CPU consumption introduced to business processes by about 50%.
- Improve function stack merging efficiency, reducing resource overhead for function stack reporting, with significant performance improvement in scenarios with many threads of the same name.
- Server
- Optimize storage performance of genesis* related MySQL tables.
- AutoTagging: When a region whitelist is configured for the cloud platform (Domain), calling the Region API is no longer required.
- AutoTagging: Failure to obtain NAT gateway, routing table, or load balancer information from Alibaba Cloud or Tencent Cloud will not affect synchronization of other resource information.
- Agent
- When BTF (BPF Type Format) is enabled on Linux, and the kernel is >= 5.5 on X86 architecture or >= 6.0 on ARM architecture, the agent will automatically use fentry/fexit instead of kprobe/kretprobe, resulting in about 15% performance improvement.
- Support compressed transmission of profiling data, reducing bandwidth consumption by 30%.
- The original environment variable ONLY_WATCH_K8S_RESOURCE has been replaced with K8S_WATCH_POLICY, documentation.
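
The fentry/fexit rule in the Agent list above amounts to a version check per architecture. The Python sketch below restates that rule under stated assumptions: the architecture labels x86_64 and aarch64, the btf_enabled flag, and the function names are hypothetical, and the agent's real detection logic may differ.

```python
import re
from typing import Dict, Tuple

# Minimum kernel versions quoted in the release note, keyed by
# assumed architecture labels.
MIN_FENTRY_KERNEL: Dict[str, Tuple[int, int]] = {
    "x86_64": (5, 5),
    "aarch64": (6, 0),
}


def parse_kernel_version(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def choose_probe(arch: str, release: str, btf_enabled: bool) -> str:
    """Return 'fentry/fexit' when the stated conditions hold, else 'kprobe/kretprobe'."""
    minimum = MIN_FENTRY_KERNEL.get(arch)
    if not btf_enabled or minimum is None:
        return "kprobe/kretprobe"
    if parse_kernel_version(release) >= minimum:
        return "fentry/fexit"
    return "kprobe/kretprobe"
```

Comparing (major, minor) tuples avoids the string-comparison pitfall where "5.10" would sort before "5.5".
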
#6.2 Stable Feature
- AutoTracing
- Support enhancing HTTP2/gRPC call logs using Wasm Plugin (currently not supporting enhancement of eBPF uprobe data), documentation.
- AutoProfiling
- Support stack unwinding using DWARF when Frame Pointer is missing.
- AutoTagging
- Support Alibaba Cloud resource synchronization using a regular account's AK/SK with ResourceGroupId.
#7. v6.6.4 [2024/08/29]
#7.1 Beta Feature
- AutoTracing
- cBPF data supports Tars protocol parsing, documentation.
- AutoProfiling
- Support stack unwinding using DWARF when Frame Pointer is missing.
- AutoTagging
- Support Alibaba Cloud resource synchronization using a regular account's AK/SK with ResourceGroupId.
- Server
- Support using ByConity instead of ClickHouse, documentation.
#7.2 Stable Feature
- AutoTracing
- Automatically correct minor clock drift between different machines in distributed tracing flame graphs.
- AutoTagging
- Support customizing K8s workload abstraction rules using Lua Plugin, documentation.
- Support synchronizing LoadBalancer type container services.
- Server
- Support using OceanBase instead of MySQL.
#8. v6.6.3 [2024/08/15]
#8.1 Beta Feature
- AutoTracing
- When TraceID exists in the protocol header, support disabling eBPF syscall_trace_id calculation (via syscall_trace_id_disabled) to reduce impact on business performance.
- Automatically correct minor clock drift between different machines in distributed tracing flame graphs.
- AutoTagging
- Support customizing K8s workload abstraction rules using Lua Plugin, documentation.
- Agent
- Support completely disabling cBPF data collection (by setting tap_interface_regex to an empty string) to reduce memory overhead.
- Support deepflow-agent using a single socket to transmit all observability data, and allow disabling this feature via multiple_sockets_to_ingester to use multiple sockets for improved transmission performance.
#8.2 Stable Feature
- AutoProfiling
- Support viewing DeepFlow eBPF On-CPU Profiling data in Grafana Panel, Demo.
- AutoMetrics
- Support aligning timestamps of request and response metrics within the same session to help AIOps systems better perform root cause analysis (thanks to pegasusljn: FR).
- AutoTagging
- Correctly tag Universal Tag for loopback NIC traffic on K8s Nodes.
- Agent
- Reduce the number of sockets used by deepflow-agent when sending data.
  - Merge sockets used for transmitting open_telemetry and open_telemetry_compressed data when integrating with OpenTelemetry.
  - Merge sockets used for agent self-monitoring, transmitting deepflow_stats and agent_log data.
  - Merge sockets used for transmitting prometheus and telegraf metrics when integrating with Prometheus and Telegraf.
#9. v6.6.2 [2024/08/01]
#9.1 Beta Feature
- AutoMetrics
- Support aligning timestamps of request and response metrics within the same session to help AIOps systems better perform root cause analysis (thanks to pegasusljn: FR).
#9.2 Stable Feature
- AutoTracing
- Optimize default values for NTP clock offset (host_clock_offset_us) and network delay (network_delay_us) configuration parameters used in network span tracing to reduce mismatch probability.
#10. v6.6.1 [2024/07/18]
#10.1 Beta Feature
- AutoTagging
- Correctly tag Universal Tag for loopback NIC traffic on K8s Nodes.
#10.2 Stable Feature
- AutoTracing
- Add URL masking capability for HTTP protocol, enable Redis protocol masking by default.
- AutoTagging
- Support synchronizing Volcano Engine resource tags, documentation.
- Stop synchronizing Pods in K8s Evicted state to reduce resource overhead.
- Integration
- Optimize mapping of schema/target and other fields in OTel Span to l7_flow_log, documentation.
- Agent
- Support aggregated collection of traffic from multiple member physical NICs of an Open vSwitch bond interface.
#11. v6.6.0 [2024/07/04]
#11.1 Backward Incompatible Change
- AutoProfiling
- Use Dataframe return format to compress response size and improve API performance, PR, documentation.
| | #Functions | Response Size (Byte) | Download Time |
|---|---|---|---|
| Before | 450,000 | 21.9M | 6.16s |
| After | 450,000 | 3.07M | 0.78s |
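
One plausible reason a Dataframe-style response is smaller is that column names are sent once instead of once per row. The Python sketch below illustrates that idea with a generic columns-plus-values layout; the layout, the field names, and the helper functions are assumptions for illustration and are not taken from the DeepFlow API.

```python
import json
from typing import Any, Dict, List


def to_dataframe(rows: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Convert same-keyed row dicts into a {"columns", "values"} layout."""
    if not rows:
        return {"columns": [], "values": []}
    columns = list(rows[0].keys())
    return {
        "columns": columns,
        "values": [[row[name] for name in columns] for row in rows],
    }


def from_dataframe(frame: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Rebuild row dicts from the {"columns", "values"} layout."""
    return [dict(zip(frame["columns"], values)) for values in frame["values"]]


# Hypothetical profiling rows; the field names are made up for the example.
rows = [
    {"function_id": i, "parent_id": i - 1, "self_value": i % 7, "total_value": i * 3}
    for i in range(1000)
]
row_bytes = len(json.dumps(rows))
frame_bytes = len(json.dumps(to_dataframe(rows)))

if __name__ == "__main__":
    print(f"row-oriented: {row_bytes} bytes, dataframe: {frame_bytes} bytes")
```

For reference, the table above corresponds to roughly a 7x smaller response (21.9M to 3.07M) and roughly an 8x shorter download (6.16s to 0.78s).
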
#11.2 Beta Feature
- AutoTagging
- Support synchronizing Volcano Engine resource tags, documentation.
- Agent
- Support aggregated collection of traffic from multiple member physical NICs of an Open vSwitch bond interface.