- Cross-ecosystem worm propagation from the Python ecosystem into npm, enabling cascading, multi-ecosystem supply chain compromise.
- Multiple parallel exfiltration channels and comprehensive credential harvesting ensured data theft and persistence across cloud, GitHub, and local environments.
- Persistence via developer tool hooks and malicious workflows demanded immediate remediation: rotate credentials, audit repositories, and patch dependencies.
Introduction
On April 30, 2026, the Python ecosystem was hit by a significant supply chain attack: versions 2.6.2 and 2.6.3 of the PyPI package ‘lightning’ (PyTorch Lightning) were compromised. The attack, attributed to the same threat actor behind the earlier “Mini Shai-Hulud” campaign, shows how quickly supply chain attacks on AI and machine learning infrastructure are maturing.
The attack is particularly concerning because PyTorch Lightning is a widely used deep learning framework that appears in the dependency trees of countless projects building image classifiers, fine-tuning large language models, running diffusion models, and developing time-series forecasters. Running pip install lightning and then importing the package was all it took to activate the malicious payload.
Attack Overview
Affected Versions
The compromise affected two specific versions:
- lightning version 2.6.2
- lightning version 2.6.3
These versions contained hidden _runtime directories with obfuscated JavaScript payloads that executed automatically upon module import. This means any developer or CI/CD pipeline that installed these versions during the affected window became compromised.
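Because the payload runs on import, a version check should read package metadata rather than import the package. The following is a minimal sketch of such a check; the function names are illustrative and the version set simply mirrors the two releases listed above.

```python
from importlib import metadata
from typing import Optional

# The two releases named in this article as compromised.
COMPROMISED_VERSIONS = {"2.6.2", "2.6.3"}


def is_compromised(version: str) -> bool:
    """Return True if the version string is one of the known-bad releases."""
    return version.strip() in COMPROMISED_VERSIONS


def installed_lightning_version() -> Optional[str]:
    """Read the installed version from metadata without importing the package."""
    try:
        return metadata.version("lightning")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_lightning_version()
    if found is None:
        print("lightning is not installed in this environment")
    elif is_compromised(found):
        print(f"WARNING: lightning {found} is a compromised release")
    else:
        print(f"lightning {found} is not one of the listed compromised releases")
```

Run it with the interpreter of each virtual environment or CI image you want to check, since every environment carries its own metadata.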
Attack Scope
The malware’s reach extends far beyond the initial PyPI compromise. The attack includes sophisticated worm propagation mechanisms that spread to npm packages if npm publish credentials are discovered. This cross-ecosystem spread represents a new level of threat sophistication, where a single compromised package can trigger cascading attacks across multiple package management systems.
Technical Analysis
Payload Delivery Mechanism
The malicious code was delivered through a hidden _runtime directory containing:
- _runtime/start.py: A Python loader that initializes the payload on import
- _runtime/router_runtime.js: An obfuscated JavaScript payload (14.8 MB, Bun runtime)
The use of obfuscation and a large payload size suggests the attackers invested significant effort in evading detection. The 14.8 MB size is particularly notable, as it indicates the payload includes a complete Bun runtime environment.
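The two file names above give defenders something concrete to look for on disk. The sketch below walks a directory tree (for example a site-packages folder) and reports any _runtime directory that contains either file. Where exactly the directory sits inside the installed package is an assumption, which is why the search is recursive; the function name is hypothetical.

```python
from pathlib import Path

# File names reported for the payload; their location in the tree is assumed unknown.
PAYLOAD_FILES = ("start.py", "router_runtime.js")


def find_runtime_dirs(root: Path) -> list[Path]:
    """Return every `_runtime` directory under root that holds a known payload file."""
    hits = []
    for candidate in root.rglob("_runtime"):
        if candidate.is_dir() and any((candidate / name).is_file() for name in PAYLOAD_FILES):
            hits.append(candidate)
    return sorted(hits)


if __name__ == "__main__":
    import sysconfig

    # "purelib" is the site-packages directory of the running interpreter.
    site_packages = Path(sysconfig.get_paths()["purelib"])
    for hit in find_runtime_dirs(site_packages):
        print(f"suspicious directory: {hit}")
```

A hit is a reason to investigate, not proof of compromise, since an unrelated package could legitimately ship a directory with the same name.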
Data Exfiltration Channels
The malware implements four parallel exfiltration channels, ensuring stolen data gets out even if individual paths are blocked:
- HTTPS POST to C2: Direct communication with attacker-controlled servers over port 443. The domain and path are stored as encrypted strings, making static analysis more difficult.
- GitHub Commit Search Dead-Drop: The malware polls GitHub’s commit search API for commit messages prefixed with “EveryBoiWeBuildIsAWormyBoi”, which carry double-base64-encoded tokens. This clever technique uses GitHub’s public infrastructure as a dead-drop for stolen credentials.
- Attacker-Controlled Public GitHub Repo: The malware creates new public repositories with randomly chosen Dune-themed names and descriptions like “A Mini Shai-Hulud has Appeared”. Stolen credentials are committed as base64-encoded JSON files, with large files split into numbered chunks.
- Victim’s Own Repository: If the malware obtains a GitHub server token (ghs_), it pushes stolen data directly to all branches of the victim’s own repository.
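For incident responders who find a dead-drop commit, it helps to see what the message actually carries. The sketch below undoes the double base64 encoding described above. The exact message layout is not documented here, so the code assumes the encoded payload is everything after the prefix, optionally separated by whitespace or a colon; treat that layout and the function name as assumptions.

```python
import base64
import binascii
from typing import Optional

DEAD_DROP_PREFIX = "EveryBoiWeBuildIsAWormyBoi"


def decode_dead_drop(message: str) -> Optional[bytes]:
    """Decode a double-base64 payload that follows the dead-drop prefix.

    Returns None when the message lacks the prefix or the remainder is not
    valid base64 twice over. The payload position is an assumption.
    """
    if not message.startswith(DEAD_DROP_PREFIX):
        return None
    payload = message[len(DEAD_DROP_PREFIX):].strip(" :\t\r\n")
    if not payload:
        return None
    try:
        inner = base64.b64decode(payload, validate=True)
        return base64.b64decode(inner, validate=True)
    except (binascii.Error, ValueError):
        return None
```

Anything this function returns should be handled as a live secret: log only a fingerprint of it and rotate the credential it belongs to.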
Credential Harvesting
The malware targets credentials across multiple systems:
Filesystem: Scans 80+ credential file paths for GitHub tokens (ghp_, gho_) and npm tokens (npm_), reading up to 5 MB per file.
Environment: Executes gh auth token and dumps all environment variables from process.env.
GitHub Actions: On Linux runners, dumps Runner.Worker process memory via embedded Python and extracts all secrets marked “isSecret”:true, along with GITHUB_REPOSITORY and GITHUB_WORKFLOW.
Cloud Providers:
- AWS: Attempts environment variables, ~/.aws/credentials profiles, IMDSv2, and ECS endpoints to call sts:GetCallerIdentity. Enumerates and fetches all Secrets Manager values and SSM parameters.
- Azure: Uses DefaultAzureCredential to enumerate subscriptions and access Key Vault secrets.
- GCP: Authenticates via GoogleAuth and enumerates Secret Manager secrets.
This comprehensive credential harvesting approach means any machine that imported the malicious package during the affected window should be treated as fully compromised.
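To scope rotation work, defenders can run the same kind of prefix scan against their own files and learn which ones held tokens. The sketch below looks for the three prefixes named above and honors the 5 MB read limit the malware uses. The length bound in the pattern is deliberately loose and is an assumption, not the official token grammar; only the prefix is returned so that no secret ends up in logs.

```python
import re
from pathlib import Path

# Prefixes named in the article; "{20,}" is a loose, assumed length bound.
TOKEN_PATTERN = re.compile(r"\b(ghp_|gho_|npm_)[A-Za-z0-9]{20,}")
MAX_BYTES = 5 * 1024 * 1024  # same per-file limit the malware is reported to use


def token_prefixes_in(text: str) -> list[str]:
    """Return the prefix of every token-like string found, never the token itself."""
    return [match.group(1) for match in TOKEN_PATTERN.finditer(text)]


def scan_file(path: Path) -> list[str]:
    """Scan up to MAX_BYTES of a file for token-like strings."""
    with path.open("rb") as handle:
        data = handle.read(MAX_BYTES)
    return token_prefixes_in(data.decode("utf-8", errors="ignore"))
```

For example, calling scan_file on ~/.npmrc or a shell history file shows whether a token of each type was readable there and therefore needs rotating.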
Persistence Mechanisms
Once inside a repository, the malware plants persistence hooks targeting two critical developer tools:
Claude Code Integration: The malware writes a SessionStart hook into .claude/settings.json with matcher: "*", pointing to node .vscode/setup.mjs. This hook fires every time a developer opens Claude Code in the infected repository — no tool use or user action required.
VS Code Integration: A parallel hook targets VS Code users via a runOn: folderOpen task in .vscode/tasks.json that runs node .claude/setup.mjs every time the project folder is opened.
Dropper Execution: Both hooks invoke setup.mjs, a self-contained Bun runtime bootstrapper. If Bun isn’t installed, it silently downloads bun-v1.3.13 from GitHub releases, handling multiple architectures (Linux x64/arm64/musl, macOS x64/arm64, Windows x64/arm64).
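Both hooks live in small JSON files, so they are easy to audit. The sketch below pulls out the commands that would run automatically, assuming the usual layouts: a hooks.SessionStart list of entries with nested command hooks in .claude/settings.json, and tasks with runOptions.runOn set to folderOpen in .vscode/tasks.json. The function names are illustrative, and the functions take already-parsed dictionaries because tasks.json may contain comments that a strict JSON parser rejects.

```python
from typing import Any


def claude_session_start_commands(settings: dict[str, Any]) -> list[str]:
    """Collect every command registered under hooks.SessionStart."""
    commands = []
    for entry in settings.get("hooks", {}).get("SessionStart", []):
        for hook in entry.get("hooks", []):
            command = hook.get("command")
            if command:
                commands.append(command)
    return commands


def vscode_folder_open_commands(tasks_file: dict[str, Any]) -> list[str]:
    """Collect commands of tasks that run automatically when the folder opens."""
    commands = []
    for task in tasks_file.get("tasks", []):
        if task.get("runOptions", {}).get("runOn") == "folderOpen":
            commands.append(str(task.get("command", "")))
    return commands


def references_dropper(commands: list[str]) -> bool:
    """True if any auto-run command mentions the setup.mjs dropper."""
    return any("setup.mjs" in command for command in commands)
```

Any auto-run command you did not add yourself deserves review, whether or not it mentions setup.mjs.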
Malicious GitHub Actions Workflow: If the malware holds a GitHub token with write access, it pushes a workflow named “Formatter” to the victim’s repository. On every push, it dumps all repository secrets via ${{ toJSON(secrets) }} and uploads them as downloadable Actions artifacts.
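The workflow described above leaves two textual traces: a top-level name of Formatter and an expression that serializes the entire secrets context. A rough sketch of a scanner for both follows. It uses plain regular expressions rather than a YAML parser, so it is a heuristic under that assumption; the function names are hypothetical.

```python
import re
from pathlib import Path

# Serializing the whole secrets context is rarely legitimate.
SECRETS_DUMP = re.compile(r"toJSON\(\s*secrets\s*\)", re.IGNORECASE)
# Only an unindented "name:" key, i.e. the workflow name, not a step name.
WORKFLOW_NAME = re.compile(r"^name:\s*['\"]?Formatter['\"]?\s*$", re.MULTILINE)


def workflow_findings(text: str) -> list[str]:
    """Return human-readable findings for one workflow file's text."""
    findings = []
    if WORKFLOW_NAME.search(text):
        findings.append("workflow is named Formatter")
    if SECRETS_DUMP.search(text):
        findings.append("workflow serializes the whole secrets context")
    return findings


def scan_workflows(repo_root: Path) -> dict[str, list[str]]:
    """Scan .github/workflows/*.yml and *.yaml under repo_root."""
    results: dict[str, list[str]] = {}
    workflows = repo_root / ".github" / "workflows"
    if not workflows.is_dir():
        return results
    for path in sorted(workflows.glob("*.y*ml")):
        findings = workflow_findings(path.read_text(encoding="utf-8", errors="replace"))
        if findings:
            results[path.name] = findings
    return results
```

A legitimate workflow can also be called Formatter, so the name alone is weak evidence; the secrets dump is the stronger signal.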
Indicators of Compromise
Organizations should search for these indicators:
- Commit messages prefixed with “EveryBoiWeBuildIsAWormyBoi”
- GitHub repositories with description “A Mini Shai-Hulud has Appeared”
- Unexpected files in .claude/ and .vscode/ directories
- Unexpected GitHub Actions workflows named “Formatter”
- Suspicious entries in .vscode/tasks.json or .claude/settings.json
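The two string indicators in this list can be matched mechanically. The sketch below flags commit subjects and repository descriptions that carry them. It assumes you already have the repository descriptions as a mapping (for example exported from your organization's repository list); the helper names are illustrative, and commit_subjects relies on a local git checkout.

```python
import subprocess
from pathlib import Path

COMMIT_PREFIX = "EveryBoiWeBuildIsAWormyBoi"
REPO_DESCRIPTION = "A Mini Shai-Hulud has Appeared"


def commit_subjects(repo: Path) -> list[str]:
    """List the subject line of every commit on every ref of a local clone."""
    result = subprocess.run(
        ["git", "-C", str(repo), "log", "--all", "--format=%s"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def flag_commits(subjects: list[str]) -> list[str]:
    """Return the subjects that start with the dead-drop prefix."""
    return [s for s in subjects if s.lstrip().startswith(COMMIT_PREFIX)]


def flag_repos(descriptions: dict[str, str]) -> list[str]:
    """Return names of repositories whose description contains the marker text."""
    marker = REPO_DESCRIPTION.lower()
    return sorted(name for name, text in descriptions.items() if marker in (text or "").lower())
```

For a single clone, flag_commits(commit_subjects(Path("."))) returns the offending subjects, or an empty list if none exist.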
Remediation Steps
Immediate Actions
- Identify Affected Systems: Check all development machines and CI/CD pipelines for lightning versions 2.6.2 or 2.6.3.
- Upgrade Immediately: Update to a patched version of lightning (2.6.4 or later).
- Credential Rotation: Rotate all GitHub tokens, cloud credentials, and API keys that may have been present in affected environments.
- Repository Audit: Search all repositories for the indicators of compromise listed above.
- CI/CD Review: Examine CI/CD logs for suspicious activity, particularly around the time the malicious package was installed.
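Identifying affected systems also means checking what your projects pin, not only what is installed right now. The sketch below scans requirements-style text for exact pins of the two bad releases. It assumes the simple name==version form, with optional extras, and will miss other specifier styles or lock-file formats; the function name is hypothetical.

```python
import re

# Matches "lightning==2.6.2" or "lightning[extra] == 2.6.3" at the start of a line.
# It intentionally does not match other distributions such as "pytorch-lightning".
BAD_PIN = re.compile(
    r"\s*lightning(?:\[[^\]]*\])?\s*==\s*(2\.6\.[23])\b",
    re.IGNORECASE,
)


def find_bad_pins(requirements_text: str) -> list[tuple[int, str]]:
    """Return (line_number, version) for every pin of a compromised release."""
    hits = []
    for number, line in enumerate(requirements_text.splitlines(), start=1):
        match = BAD_PIN.match(line)
        if match:
            hits.append((number, match.group(1)))
    return hits
```

Running this over every requirements file in a monorepo or across cloned repositories gives a quick first list of projects that need attention.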
Long-Term Mitigation
- Dependency Scanning: Implement automated tools like Semgrep to detect malicious dependencies before they reach production.
- Supply Chain Verification: Use tools that verify package integrity and check for known vulnerabilities.
- Principle of Least Privilege: Limit the permissions of CI/CD tokens and development credentials.
- Code Review: Implement mandatory code review processes that include security scanning.
- Monitoring: Set up alerts for unusual credential access patterns or unexpected repository modifications.
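As one concrete form of supply chain verification, a downloaded artifact can be compared against a SHA-256 digest recorded when the dependency was last reviewed. The sketch below shows the comparison itself; the function names are illustrative, and where the pinned digest comes from (a lock file, an internal registry) is left as an assumption.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large wheels do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_hash(path: Path, expected_hex: str) -> bool:
    """Compare the file's SHA-256 digest with a previously recorded one."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

pip can enforce the same idea at install time through its hash-checking mode (pip install --require-hashes -r requirements.txt), which rejects any artifact whose digest is not listed. Note that a hash pin only protects against artifacts changing after review; it does not help if the reviewed release was already malicious.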
Broader Implications
Rising Threat Landscape
This attack is part of a concerning trend. According to security research, there have been 7 major supply chain attacks in the past 12 months, compared to only 9 in the two decades before that. The frequency and sophistication of these attacks are increasing dramatically.
Cross-Ecosystem Attacks
Unlike the previous Mini Shai-Hulud campaign that targeted npm directly, this attack demonstrates the ability to compromise one ecosystem (PyPI) and use it as a springboard to attack another (npm). This cross-ecosystem approach significantly expands the potential impact.
AI/ML Infrastructure Vulnerability
The targeting of PyTorch Lightning specifically highlights the vulnerability of AI and machine learning infrastructure. These tools are often installed in high-value environments with access to significant computational resources and sensitive data.
Community Response
Semgrep has released advisories and detection rules to help organizations identify compromised packages. The security community is actively tracking the threat actor’s infrastructure and working to disrupt their operations.
However, the incident highlights a systemic issue: the Python ecosystem, like many package management systems, lacks robust mechanisms for detecting and preventing supply chain attacks before packages reach users.
Lessons Learned
For Developers
- Verify Package Sources: Always verify that packages come from official sources.
- Use Lock Files: Implement dependency lock files to ensure reproducible builds.
- Monitor Dependencies: Regularly audit your dependency tree for known vulnerabilities.
- Principle of Least Privilege: Run development tools with minimal necessary permissions.
For Package Maintainers
- Secure Credentials: Protect PyPI credentials with strong authentication and minimal scope.
- Code Review: Implement mandatory code review processes.
- Automated Testing: Use automated security scanning in your CI/CD pipeline.
- Transparency: Communicate security incidents promptly and clearly.
For Package Repositories
- Enhanced Scanning: Implement more sophisticated malware detection.
- Behavioral Analysis: Monitor for suspicious patterns in package behavior.
- Rapid Response: Develop faster processes for removing malicious packages.
- Community Coordination: Work with security researchers to identify threats early.
Conclusion
The Shai-Hulud malware in PyTorch Lightning represents a sophisticated and concerning evolution in supply chain attacks. The attack’s use of multiple exfiltration channels, cross-ecosystem propagation, and persistence mechanisms demonstrates that threat actors are investing significant resources in compromising critical infrastructure.
While the immediate threat has been addressed through package updates and security advisories, the incident serves as a stark reminder of the vulnerabilities in our software supply chains. Organizations must implement comprehensive security practices, including dependency scanning, credential management, and continuous monitoring.
The Python ecosystem, and particularly the AI/ML community, must work together to develop more robust defenses against supply chain attacks. This includes better tooling for detecting malicious packages, stronger authentication mechanisms for package maintainers, and more rapid response procedures for removing compromised packages.
For organizations that may have been affected, immediate action is required: identify affected systems, rotate credentials, and audit repositories for signs of compromise. The threat is real, but with proper vigilance and security practices, the impact can be minimized.
For more information and detection rules, visit: https://semgrep.dev/blog/2026/malicious-dependency-in-pytorch-lightning-used-for-ai-training/

