Static Analysis of The DeepSeek Android App (#1) · Issues · Gilbert Farris / goftesh

Static Analysis of The DeepSeek Android App

I performed a static analysis of DeepSeek, a Chinese LLM chatbot, using version 1.8.0 from the Google Play Store. The goal was to identify potential security and privacy issues.

I have written about DeepSeek previously here.

Additional security and privacy concerns about DeepSeek have been raised.

See also this analysis by NowSecure of the iPhone version of DeepSeek.

The findings detailed in this report are based purely on static analysis. This means that while the code exists within the app, there is no definitive proof that all of it is executed in practice. Nonetheless, the presence of such code warrants scrutiny, especially given the growing concerns around data privacy, surveillance, the potential misuse of AI-driven applications, and cyber-espionage dynamics between global powers.

Key Findings

Suspicious Data Handling & Exfiltration

- Hardcoded URLs direct data to external servers, such as ByteDance "volce.com" endpoints, raising concerns about user activity monitoring. NowSecure identified these in the iPhone app yesterday as well.
- Bespoke encryption and data obfuscation methods are present, with indications that they could be used to exfiltrate user details.
- The app contains hard-coded public keys, rather than relying on the user device's chain of trust.
- UI interaction tracking captures detailed user behavior without clear consent.
- WebView manipulation is present, which could allow the app to access private external browser data when links are opened. More details about WebView manipulation are here.
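One way an analyst could triage hardcoded endpoints like those described above is to extract URL literals from the decompiled sources and group them by host. The following is a minimal sketch, assuming the decompiled code is available as text; the function name, the regular expression, and the sample snippet (including its `example.com` hosts) are hypothetical and are not taken from the app.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse

# Matches http(s) URL literals; the path stops at quotes, whitespace, or brackets.
URL_RE = re.compile(r"https?://[A-Za-z0-9.\-]+(?::\d+)?(?:/[^\s\"'<>)]*)?")


def group_urls_by_host(text):
    """Return {hostname: set of URLs} for every URL literal found in `text`."""
    hosts = defaultdict(set)
    for match in URL_RE.finditer(text):
        url = match.group(0)
        host = urlparse(url).hostname
        if host:
            hosts[host].add(url)
    return dict(hosts)


# Hypothetical decompiled snippet; the hosts are placeholders, not app findings.
SAMPLE = '''
String a = "https://api.example.com/v1/chat";
String b = "https://telemetry.example.net/collect?x=1";
String c = "https://api.example.com/v1/login";
'''

if __name__ == "__main__":
    for host, urls in sorted(group_urls_by_host(SAMPLE).items()):
        print(host, len(urls))
```

Grouping by host makes it easier to spot third-party domains that deserve a closer look. A hit only shows that the string is present in the code, not that the app ever contacts the endpoint.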

Device Fingerprinting & Tracking

A significant portion of the analyzed code appears to focus on gathering device-specific information, which can be used for tracking and fingerprinting.

- The app collects various unique device identifiers, including UDID, Android ID, IMEI, IMSI, and carrier details.
- System properties, installed packages, and root detection mechanisms suggest potential anti-tampering measures. E.g. the app probes for the presence of Magisk, a tool that privacy advocates and security researchers use to root their Android devices.
- Geolocation and network profiling are present, indicating potential tracking capabilities and enabling or disabling of fingerprinting regimes by region.
- Hardcoded device model lists suggest the application may behave differently depending on the detected hardware.
- Multiple vendor-specific services are used to extract additional device information. E.g. if the app cannot identify the device through the standard Android SIM lookup (because permission was not granted), it attempts manufacturer-specific extensions to access the same information.
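The kinds of identifier collection and root probing listed above can be located in decompiled sources with a simple indicator scan. The sketch below is illustrative only: the method names are standard Android SDK identifiers and `com.topjohnwu.magisk` is the Magisk package name, but the category labels, the `scan_source` helper, and the sample code are assumptions made for this example and do not reproduce the app's code.

```python
import re

# Indicator patterns an analyst might search for. The API names come from the
# Android SDK; the category names and this selection are assumptions.
INDICATORS = {
    "imei": re.compile(r"\bgetImei\s*\(|\bgetDeviceId\s*\("),
    "imsi": re.compile(r"\bgetSubscriberId\s*\("),
    "sim_serial": re.compile(r"\bgetSimSerialNumber\s*\("),
    "android_id": re.compile(r'Settings\.Secure\.ANDROID_ID|"android_id"'),
    "installed_packages": re.compile(r"\bgetInstalledPackages\s*\("),
    "root_check": re.compile(r"com\.topjohnwu\.magisk|/system/xbin/su|/sbin/su"),
}


def scan_source(text):
    """Return {category: [1-based line numbers]} for lines matching an indicator."""
    hits = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        for category, pattern in INDICATORS.items():
            if pattern.search(line):
                hits.setdefault(category, []).append(lineno)
    return hits


# Hypothetical decompiled snippet used only to exercise the scanner.
SAMPLE = """\
String id = telephonyManager.getImei();
String sub = telephonyManager.getSubscriberId();
boolean rooted = new File("/system/xbin/su").exists();
String greeting = "hello";
"""

if __name__ == "__main__":
    for category, lines in scan_source(SAMPLE).items():
        print(category, lines)
```

A string match like this is a starting point for manual review. Obfuscated or reflective calls, such as the vendor-specific lookups mentioned above, would not be caught by name-based patterns alone.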

Potential Malware-Like Behavior

While no definitive conclusions can be drawn without dynamic analysis, several observed behaviors align with known spyware and malware patterns:

- The app uses reflection and UI overlays, which could facilitate unauthorized screen capture or phishing attacks.
- SIM card details, serial numbers, and other device-specific data are aggregated for unknown purposes.
- The app implements country-based access restrictions and "risk-device" detection, suggesting possible surveillance mechanisms.
- The app implements calls to load Dex modules, where additional code is loaded from files with a .so extension at runtime.
- The .so files themselves turn around and make additional calls to dlopen(), which can be used to load additional .so files. This facility is not typically checked by Google Play Protect and other static analysis services.
- The .so files can be implemented in native code, such as C++. The use of native code adds a layer of complexity to the analysis process and obscures the full extent of the app's capabilities. Moreover, native code can be leveraged to more easily escalate privileges, potentially exploiting vulnerabilities within the operating system or device hardware.
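The runtime loading paths described above can be flagged with two coarse checks: a search for class-loading APIs in decompiled Java code, and a search for loader symbol names in native libraries. This is a rough sketch under stated assumptions: the helper names and test data are hypothetical, and a raw byte search only shows that a symbol name is present in the file, not that the function is ever called.

```python
import re

ELF_MAGIC = b"\x7fELF"

# Symbol names associated with runtime loading in native code.
LOADER_SYMBOLS = (b"dlopen", b"dlsym", b"android_dlopen_ext")

# Java-side APIs that load code or native libraries at runtime.
JAVA_LOADER_RE = re.compile(
    r"\b(DexClassLoader|InMemoryDexClassLoader|PathClassLoader"
    r"|System\.loadLibrary|System\.load)\b"
)


def java_loader_calls(text):
    """Return the sorted set of loader API names referenced in decompiled code."""
    return sorted(set(JAVA_LOADER_RE.findall(text)))


def native_loader_symbols(blob):
    """Return loader symbol names found as NUL-terminated strings in an ELF image."""
    if not blob.startswith(ELF_MAGIC):
        return []  # not an ELF file, so there is no symbol table to inspect
    return sorted(s.decode() for s in LOADER_SYMBOLS if s + b"\x00" in blob)


# Hypothetical inputs: a fake ELF blob and a fake decompiled line.
FAKE_SO = ELF_MAGIC + b"\x00" * 12 + b"\x00dlopen\x00dlsym\x00printf\x00"
FAKE_JAVA = 'new DexClassLoader(path, dir, null, parent); System.loadLibrary("x");'

if __name__ == "__main__":
    print(java_loader_calls(FAKE_JAVA))
    print(native_loader_symbols(FAKE_SO))
```

In practice the bytes would come from the .so files extracted from the APK. A proper ELF parser would give more reliable results than a substring search, at the cost of an extra dependency.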

Remarks

While data collection is common in modern applications for debugging and improving user experience, aggressive fingerprinting raises significant privacy concerns. The DeepSeek app requires users to log in with a valid email, which should already provide sufficient authentication. There is no valid reason for the app to aggressively collect and transmit unique device identifiers, IMEI numbers, SIM card details, and other non-resettable system properties.

The extent of tracking observed here exceeds typical analytics practices, potentially enabling persistent user tracking and re-identification across devices. These behaviors, combined with obfuscation techniques and network communication with third-party tracking services, warrant a higher level of scrutiny from security researchers and users alike.

The use of runtime code loading, together with the bundling of native code, suggests that the app could allow the deployment and execution of unreviewed, remotely delivered code. This is a serious potential attack vector. This report presents no evidence that remotely deployed code is being executed, only that the facility for it appears to be present.

Additionally, the app's approach to detecting rooted devices appears excessive for an AI chatbot. Root detection is often justified in DRM-protected streaming services, where security and content protection are critical, or in competitive video games to prevent cheating. However, there is no clear rationale for such strict measures in an application of this nature, raising further questions about its intent.

Users and organizations thinking about installing DeepSeek should be aware of these potential risks. If this application is being used within an enterprise or government environment, additional vetting and security controls should be enforced before allowing its deployment on managed devices.

Disclaimer: The analysis presented in this report is based on static code review and does not imply that all detected functions are actively used. Further investigation is required for definitive conclusions.