feat: bundle last30days skill and harden truncated chat responses

ilya-bov
2026-03-03 14:23:15 +03:00
parent 934327cd20
commit 96f065595d
176 changed files with 16280 additions and 1 deletions


@@ -0,0 +1,592 @@
---
name: last30days
description: "Research a topic from the last 30 days. Also triggered by 'last30'. Sources: Reddit, X, YouTube, Hacker News, Polymarket, web. Become an expert and write copy-paste-ready prompts."
argument-hint: 'last30 AI video tools, last30 best project management tools'
allowed-tools: Bash, Read, Write, AskUserQuestion, WebSearch
homepage: https://github.com/mvanhorn/last30days-skill
user-invocable: true
metadata:
  eggent:
    emoji: "📰"
    files:
      - "scripts/*"
    homepage: https://github.com/mvanhorn/last30days-skill
    tags:
      - research
      - reddit
      - x
      - youtube
      - hackernews
      - trends
      - prompts
---
# last30days v2.5: Research Any Topic from the Last 30 Days
Research ANY topic across Reddit, X, YouTube, Hacker News, Polymarket, and the web. Surface what people are actually discussing, recommending, betting on, and debating right now.
## CRITICAL: Parse User Intent
Before doing anything, parse the user's input for:
1. **TOPIC**: What they want to learn about (e.g., "web app mockups", "Claude Code skills", "image generation")
2. **TARGET TOOL** (if specified): Where they'll use the prompts (e.g., "Nano Banana Pro", "ChatGPT", "Midjourney")
3. **QUERY TYPE**: What kind of research they want:
- **PROMPTING** - "X prompts", "prompting for X", "X best practices" → User wants to learn techniques and get copy-paste prompts
- **RECOMMENDATIONS** - "best X", "top X", "what X should I use", "recommended X" → User wants a LIST of specific things
- **NEWS** - "what's happening with X", "X news", "latest on X" → User wants current events/updates
- **GENERAL** - anything else → User wants broad understanding of the topic
Common patterns:
- `[topic] for [tool]` → "web mockups for Nano Banana Pro" → TOOL IS SPECIFIED
- `[topic] prompts for [tool]` → "UI design prompts for Midjourney" → TOOL IS SPECIFIED
- Just `[topic]` → "iOS design mockups" → TOOL NOT SPECIFIED, that's OK
- "best [topic]" or "top [topic]" → QUERY_TYPE = RECOMMENDATIONS
- "what are the best [topic]" → QUERY_TYPE = RECOMMENDATIONS
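The classification rules above can be sketched as a small function. This is a minimal illustration of the precedence ("best practices" maps to PROMPTING before the generic "best" check fires); the function name and regexes are illustrative, not part of the skill's scripts:

```python
import re


def classify_query(query: str) -> str:
    """Rough QUERY_TYPE classifier mirroring the patterns above."""
    q = query.lower()
    # PROMPTING checks run first so "X best practices" doesn't match "best X"
    if re.search(r"\bprompts?\b|\bprompting\b|\bbest practices\b", q):
        return "PROMPTING"
    if re.search(r"^(what are the )?(best|top)\b|\brecommended\b|what .* should i use", q):
        return "RECOMMENDATIONS"
    if re.search(r"\bnews\b|what's happening|\blatest\b", q):
        return "NEWS"
    return "GENERAL"
```

For example, `classify_query("best project management tools")` yields `RECOMMENDATIONS`, while `classify_query("last30 AI video tools")` falls through to `GENERAL`.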
**IMPORTANT: Do NOT ask about target tool before research.**
- If tool is specified in the query, use it
- If tool is NOT specified, run research first, then ask AFTER showing results
**Store these variables:**
- `TOPIC = [extracted topic]`
- `TARGET_TOOL = [extracted tool, or "unknown" if not specified]`
- `QUERY_TYPE = [PROMPTING | RECOMMENDATIONS | NEWS | GENERAL]`
**DISPLAY your parsing to the user.** Before running any tools, output:
```
I'll research {TOPIC} across Reddit, X, and the web to find what's been discussed in the last 30 days.
Parsed intent:
- TOPIC = {TOPIC}
- TARGET_TOOL = {TARGET_TOOL or "unknown"}
- QUERY_TYPE = {QUERY_TYPE}
Research typically takes 2-8 minutes (niche topics take longer). Starting now.
```
If TARGET_TOOL is known, mention it in the intro: "...to find {QUERY_TYPE}-style content for use in {TARGET_TOOL}."
This text MUST appear before you call any tools. It confirms to the user that you understood their request.
---
## Step 0.5: Resolve X Handle (if topic could have an X account)
If TOPIC looks like it could have its own X/Twitter account - **people, creators, brands, products, tools, companies, communities** (e.g., "Dor Brothers", "Jason Calacanis", "Nano Banana Pro", "Seedance", "Midjourney"), do ONE quick WebSearch:
```
WebSearch("{TOPIC} X twitter handle site:x.com")
```
From the results, extract their X/Twitter handle. Look for:
- **Verified profile URLs** like `x.com/{handle}` or `twitter.com/{handle}`
- Mentions like "@handle" in bios, articles, or social profiles
- "Follow @handle on X" patterns
**Verify the account is real, not a parody/fan account.** Check for:
- Verified/blue checkmark in the search results
- Official website linking to the X account
- Consistent naming (e.g., @thedorbrothers for "The Dor Brothers", not @DorBrosFan)
- If results only show fan/parody/news accounts (not the entity's own account), skip - the entity may not have an X presence
If you find a clear, verified handle, pass it as `--x-handle={handle}` (without @). This searches that account's posts directly - finding content they posted that doesn't mention their own name.
**Skip this step if:**
- TOPIC is clearly a generic concept, not an entity (e.g., "best rap songs 2026", "how to use Docker", "AI ethics debate")
- TOPIC already contains @ (user provided the handle directly)
- Using `--quick` depth
- WebSearch shows no official X account exists for this entity
Store: `RESOLVED_HANDLE = {handle or empty}`
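The handle-extraction step amounts to pulling the first profile path out of the search-result text and discarding non-profile URL segments. A minimal sketch, assuming the result text contains `x.com/...` or `twitter.com/...` URLs (the function name and skip-list are illustrative):

```python
import re


def extract_handle(results_text: str):
    """Pull the first x.com/twitter.com profile handle from search-result text."""
    m = re.search(r"(?:x|twitter)\.com/([A-Za-z0-9_]{1,15})\b", results_text)
    if not m:
        return None
    handle = m.group(1)
    # Skip URL path segments that are not profiles (search pages, intents, etc.)
    if handle.lower() in {"i", "home", "search", "hashtag", "intent"}:
        return None
    return handle
```

The parody/fan-account check still has to happen by reading the surrounding search results; this only normalizes the handle.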
---
## Agent Mode (--agent flag)
If `--agent` appears in ARGUMENTS (e.g., `/last30days plaud granola --agent`):
1. **Skip** the intro display block ("I'll research X across Reddit...")
2. **Skip** any `AskUserQuestion` calls - use `TARGET_TOOL = "unknown"` if not specified
3. **Run** the research script and WebSearch exactly as normal
4. **Skip** the "WAIT FOR USER RESPONSE" pause
5. **Skip** the follow-up invitation ("I'm now an expert on X...")
6. **Output** the complete research report and stop - do not wait for further input
Agent mode report format:
```
## Research Report: {TOPIC}
Generated: {date} | Sources: Reddit, X, YouTube, HN, Polymarket, Web
### Key Findings
[3-5 bullet points, highest-signal insights with citations]
### What I learned
{The full "What I learned" synthesis from normal output}
### Stats
{The standard stats block}
```
---
## Research Execution
**Step 1: Run the research script (FOREGROUND — do NOT background this)**
**CRITICAL: Run this command in the FOREGROUND with a 5-minute timeout. Do NOT use run_in_background. The full output contains Reddit, X, AND YouTube data that you need to read completely.**
**IMPORTANT: The script handles API key/Codex auth detection automatically.** Run it and check the output to determine mode.
```bash
# Find skill root — works in repo checkout, project scope, Claude Code, or Codex install
for dir in \
  "." \
  ".meta/skills/last30days" \
  "${PROJECT_ID:+data/projects/$PROJECT_ID/.meta/skills/last30days}" \
  "${CLAUDE_PLUGIN_ROOT:-}" \
  "$HOME/.claude/skills/last30days" \
  "$HOME/.agents/skills/last30days" \
  "$HOME/.codex/skills/last30days"; do
  [ -n "$dir" ] && [ -f "$dir/scripts/last30days.py" ] && SKILL_ROOT="$dir" && break
done
if [ -z "${SKILL_ROOT:-}" ]; then
  echo "ERROR: Could not find scripts/last30days.py" >&2
  exit 1
fi
python3 "${SKILL_ROOT}/scripts/last30days.py" "$ARGUMENTS" --emit=compact  # Add --x-handle=HANDLE if RESOLVED_HANDLE is set
```
Use a **timeout of 300000** (5 minutes) on the Bash call. The script typically takes 1-3 minutes.
The script will automatically:
- Detect available API keys
- Run Reddit/X/YouTube/Hacker News/Polymarket searches
- Output ALL results including YouTube transcripts, HN comments, and prediction market odds
**Read the ENTIRE output.** It contains SIX data sections in this order: Reddit items, X items, YouTube items, Hacker News items, Polymarket items, and WebSearch items. If you miss sections, you will produce incomplete stats.
**YouTube items in the output look like:** `**{video_id}** (score:N) {channel_name} [N views, N likes]` followed by a title, URL, and optional transcript snippet. Count them and include them in your synthesis and stats block.
---
## STEP 2: DO WEBSEARCH AFTER SCRIPT COMPLETES
After the script finishes, do WebSearch to supplement with blogs, tutorials, and news.
For **ALL modes**, do WebSearch to supplement (or provide all data in web-only mode).
Choose search queries based on QUERY_TYPE:
**If RECOMMENDATIONS** ("best X", "top X", "what X should I use"):
- Search for: `best {TOPIC} recommendations`
- Search for: `{TOPIC} list examples`
- Search for: `most popular {TOPIC}`
- Goal: Find SPECIFIC NAMES of things, not generic advice
**If NEWS** ("what's happening with X", "X news"):
- Search for: `{TOPIC} news 2026`
- Search for: `{TOPIC} announcement update`
- Goal: Find current events and recent developments
**If PROMPTING** ("X prompts", "prompting for X"):
- Search for: `{TOPIC} prompts examples 2026`
- Search for: `{TOPIC} techniques tips`
- Goal: Find prompting techniques and examples to create copy-paste prompts
**If GENERAL** (default):
- Search for: `{TOPIC} 2026`
- Search for: `{TOPIC} discussion`
- Goal: Find what people are actually saying
For ALL query types:
- **USE THE USER'S EXACT TERMINOLOGY** - don't substitute or add tech names based on your knowledge
- EXCLUDE reddit.com, x.com, twitter.com (covered by script)
- INCLUDE: blogs, tutorials, docs, news, GitHub repos
- **DO NOT output a separate "Sources:" block** — instead, include the top 3-5 web source names as inline links on the 🌐 Web: stats line (see stats format below). The WebSearch tool requires citation; satisfy it there, not as a trailing section.
**Options** (passed through from user's command):
- `--days=N` → Look back N days instead of 30 (e.g., `--days=7` for weekly roundup)
- `--quick` → Faster, fewer sources (8-12 each)
- (default) → Balanced (20-30 each)
- `--deep` → Comprehensive (50-70 Reddit, 40-60 X)
---
## Judge Agent: Synthesize All Sources
**After all searches complete, internally synthesize (don't display stats yet):**
The Judge Agent must:
1. Weight Reddit/X sources HIGHER (they have engagement signals: upvotes, likes)
2. Weight YouTube sources HIGH (they have views, likes, and transcript content)
3. Weight WebSearch sources LOWER (no engagement data)
4. Identify patterns that appear across ALL sources (strongest signals)
5. Note any contradictions between sources
6. Extract the top 3-5 actionable insights
7. **Cross-platform signals are the strongest evidence.** When items have `[also on: Reddit, HN]` or similar tags, it means the same story appears across multiple platforms. Lead with these cross-platform findings - they're the most important signals in the research.
### Prediction Markets (Polymarket)
**CRITICAL: When Polymarket returns relevant markets, prediction market odds are among the highest-signal data points in your research.** Real money on outcomes cuts through opinion. Treat them as strong evidence, not an afterthought.
**How to interpret and synthesize Polymarket data:**
1. **Prefer structural/long-term markets over near-term deadlines.** Championship odds > regular season title. Regime change > near-term strike deadline. IPO/major milestone > incremental update. Presidency > individual state primary. When multiple markets exist, the bigger question is more interesting to the user.
2. **When the topic is an outcome in a multi-outcome market, call out that specific outcome's odds and movement.** Don't just say "Polymarket has a #1 seed market" - say "Arizona has a 28% chance of being the #1 overall seed, up 10% this month." The user cares about THEIR topic's position in the market.
3. **Weave odds into the narrative as supporting evidence.** Don't isolate Polymarket data in its own paragraph. Instead: "Final Four buzz is building - Polymarket gives Arizona a 12% chance to win the championship (up 3% this week), and 28% to earn a #1 seed."
4. **Citation format:** Always include specific odds AND movement. "Polymarket has Arizona at 28% for a #1 seed (up 10% this month)" - not just "per Polymarket."
5. **When multiple relevant markets exist, highlight 3-5 of the most interesting ones** in your synthesis, ordered by importance (structural > near-term). Don't just pick the highest-volume one.
**Domain examples of market importance ranking:**
- **Sports:** Championship/tournament odds > conference title > regular season > weekly matchup
- **Geopolitics:** Regime change/structural outcomes > near-term strike deadlines > sanctions
- **Tech/Business:** IPO, major product launch, company milestones > incremental updates
- **Elections:** Presidency > primary > individual state
**Do NOT display stats here - they come at the end, right before the invitation.**
---
## FIRST: Internalize the Research
**CRITICAL: Ground your synthesis in the ACTUAL research content, not your pre-existing knowledge.**
Read the research output carefully. Pay attention to:
- **Exact product/tool names** mentioned (e.g., if research mentions "ClawdBot" or "@clawdbot", that's a DIFFERENT product than "Claude Code" - don't conflate them)
- **Specific quotes and insights** from the sources - use THESE, not generic knowledge
- **What the sources actually say**, not what you assume the topic is about
**ANTI-PATTERN TO AVOID**: If user asks about "clawdbot skills" and research returns ClawdBot content (self-hosted AI agent), do NOT synthesize this as "Claude Code skills" just because both involve "skills". Read what the research actually says.
### If QUERY_TYPE = RECOMMENDATIONS
**CRITICAL: Extract SPECIFIC NAMES, not generic patterns.**
When user asks "best X" or "top X", they want a LIST of specific things:
- Scan research for specific product names, tool names, project names, skill names, etc.
- Count how many times each is mentioned
- Note which sources recommend each (Reddit thread, X post, blog)
- List them by popularity/mention count
**BAD synthesis for "best Claude Code skills":**
> "Skills are powerful. Keep them under 500 lines. Use progressive disclosure."
**GOOD synthesis for "best Claude Code skills":**
> "Most mentioned skills: /commit (5 mentions), remotion skill (4x), git-worktree (3x), /pr (3x). The Remotion announcement got 16K likes on X."
### For all QUERY_TYPEs
Identify from the ACTUAL RESEARCH OUTPUT:
- **PROMPT FORMAT** - Does research recommend JSON, structured params, natural language, keywords?
- The top 3-5 patterns/techniques that appeared across multiple sources
- Specific keywords, structures, or approaches mentioned BY THE SOURCES
- Common pitfalls mentioned BY THE SOURCES
---
## THEN: Show Summary + Invite Vision
**Display in this EXACT sequence:**
**FIRST - What I learned (based on QUERY_TYPE):**
**If RECOMMENDATIONS** - Show specific things mentioned with sources:
```
🏆 Most mentioned:
[Tool Name] - {n}x mentions
Use Case: [what it does]
Sources: @handle1, @handle2, r/sub, blog.com
[Tool Name] - {n}x mentions
Use Case: [what it does]
Sources: @handle3, r/sub2, Complex
Notable mentions: [other specific things with 1-2 mentions]
```
**CRITICAL for RECOMMENDATIONS:**
- Each item MUST have a "Sources:" line with actual @handles from X posts (e.g., @LONGLIVE47, @ByDobson)
- Include subreddit names (r/hiphopheads) and web sources (Complex, Variety)
- Parse @handles from research output and include the highest-engagement ones
- Format naturally - tables work well for wide terminals, stacked cards for narrow
**If PROMPTING/NEWS/GENERAL** - Show synthesis and patterns:
CITATION RULE: Cite sources sparingly to prove research is real.
- In the "What I learned" intro: cite 1-2 top sources total, not every sentence
- In KEY PATTERNS: cite 1 source per pattern, short format: "per @handle" or "per r/sub"
- Do NOT include engagement metrics in citations (likes, upvotes) - save those for stats box
- Do NOT chain multiple citations: "per @x, @y, @z" is too much. Pick the strongest one.
CITATION PRIORITY (most to least preferred):
1. @handles from X — "per @handle" (these prove the tool's unique value)
2. r/subreddits from Reddit — "per r/subreddit"
3. YouTube channels — "per [channel name] on YouTube" (transcript-backed insights)
4. HN discussions — "per HN" or "per hn/username" (developer community signal)
5. Polymarket — "Polymarket has X at Y% (up/down Z%)" with specific odds and movement
6. Web sources — ONLY when Reddit/X/YouTube/HN/Polymarket don't cover that specific fact
The tool's value is surfacing what PEOPLE are saying, not what journalists wrote.
When both a web article and an X post cover the same fact, cite the X post.
URL FORMATTING: NEVER paste raw URLs in the output.
- **BAD:** "per https://www.rollingstone.com/music/music-news/kanye-west-bully-1235506094/"
- **GOOD:** "per Rolling Stone"
- **GOOD:** "per Complex"
Use the publication name, not the URL. The user doesn't need links — they need clean, readable text.
**BAD:** "His album is set for March 20 (per Rolling Stone; Billboard; Complex)."
**GOOD:** "His album BULLY drops March 20 — fans on X are split on the tracklist, per @honest30bgfan_"
**GOOD:** "Ye's apology got massive traction on r/hiphopheads"
**OK** (web, only when Reddit/X don't have it): "The Hellwatt Festival runs July 4-18 at RCF Arena, per Billboard"
**Lead with people, not publications.** Start each topic with what Reddit/X users are saying/feeling, then add web context only if needed. The user came here for the conversation, not the press release.
```
What I learned:
**{Topic 1}** — [1-2 sentences about what people are saying, per @handle or r/sub]
**{Topic 2}** — [1-2 sentences, per @handle or r/sub]
**{Topic 3}** — [1-2 sentences, per @handle or r/sub]
KEY PATTERNS from the research:
1. [Pattern] — per @handle
2. [Pattern] — per r/sub
3. [Pattern] — per @handle
```
**THEN - Stats (right before invitation):**
**CRITICAL: Calculate actual totals from the research output.**
- Count posts/threads from each section
- Sum engagement: parse `[Xlikes, Yrt]` from each X post, `[Xpts, Ycmt]` from Reddit
- Identify top voices: highest-engagement @handles from X, most active subreddits
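Summing engagement means parsing the bracketed markers out of each line of script output. A sketch under the assumption that the markers look exactly like the examples above (`[1.2Klikes, 40rt]` for X, `[318pts, 92cmt]` for Reddit); the function name is illustrative:

```python
import re


def sum_engagement(lines):
    """Tally likes/reposts from X markers and points/comments from Reddit markers."""
    def to_num(s):
        # "1.2K" -> 1200, "40" -> 40
        if s and s[-1] in "Kk":
            return int(float(s[:-1]) * 1000)
        return int(s)

    totals = {"likes": 0, "reposts": 0, "points": 0, "comments": 0}
    for line in lines:
        m = re.search(r"\[([\d.]+K?)likes,\s*([\d.]+K?)rt\]", line, re.I)
        if m:
            totals["likes"] += to_num(m.group(1))
            totals["reposts"] += to_num(m.group(2))
        m = re.search(r"\[([\d.]+K?)pts,\s*([\d.]+K?)cmt\]", line, re.I)
        if m:
            totals["points"] += to_num(m.group(1))
            totals["comments"] += to_num(m.group(2))
    return totals
```

Whatever the exact marker format in a given run, the point stands: the stats block must come from counted and summed values, not estimates.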
**Copy this EXACTLY, replacing only the {placeholders}:**
```
---
✅ All agents reported back!
├─ 🟠 Reddit: {N} threads │ {N} upvotes │ {N} comments
├─ 🔵 X: {N} posts │ {N} likes │ {N} reposts
├─ 🔴 YouTube: {N} videos │ {N} views │ {N} with transcripts
├─ 🟡 HN: {N} stories │ {N} points │ {N} comments
├─ 📊 Polymarket: {N} markets │ {short summary of up to 5 most relevant market odds, e.g. "Championship: 12%, #1 Seed: 28%, Big 12: 64%, vs Kansas: 71%"}
├─ 🌐 Web: {N} pages — Source Name, Source Name, Source Name
└─ 🗣️ Top voices: @{handle1} ({N} likes), @{handle2} │ r/{sub1}, r/{sub2}
---
```
**WebSearch citation note:** The WebSearch tool requires source citation. This requirement is satisfied by naming the web sources on the 🌐 Web: line above (plain names, no URLs — URLs wrap badly in terminals). Do NOT append a separate "Sources:" section after the invitation.
**CRITICAL: Omit any source line that returned 0 results.** Do NOT show "0 threads", "0 stories", "0 markets", or "(no results this cycle)". If a source found nothing, DELETE that line entirely - don't include it at all.
NEVER use plain text dashes (-) or pipe (|). ALWAYS use ├─ └─ │ and the emoji.
**SELF-CHECK before displaying**: Re-read your "What I learned" section. Does it match what the research ACTUALLY says? If you catch yourself projecting your own knowledge instead of the research, rewrite it.
**LAST - Invitation (adapt to QUERY_TYPE):**
**CRITICAL: Every invitation MUST include 2-3 specific example suggestions based on what you ACTUALLY learned from the research.** Don't be generic — show the user you absorbed the content by referencing real things from the results.
**If QUERY_TYPE = PROMPTING:**
```
---
I'm now an expert on {TOPIC} for {TARGET_TOOL}. What do you want to make? For example:
- [specific idea based on popular technique from research]
- [specific idea based on trending style/approach from research]
- [specific idea riffing on what people are actually creating]
Just describe your vision and I'll write a prompt you can paste straight into {TARGET_TOOL}.
```
**If QUERY_TYPE = RECOMMENDATIONS:**
```
---
I'm now an expert on {TOPIC}. Want me to go deeper? For example:
- [Compare specific item A vs item B from the results]
- [Explain why item C is trending right now]
- [Help you get started with item D]
```
**If QUERY_TYPE = NEWS:**
```
---
I'm now an expert on {TOPIC}. Some things you could ask:
- [Specific follow-up question about the biggest story]
- [Question about implications of a key development]
- [Question about what might happen next based on current trajectory]
```
**If QUERY_TYPE = GENERAL:**
```
---
I'm now an expert on {TOPIC}. Some things I can help with:
- [Specific question based on the most discussed aspect]
- [Specific creative/practical application of what you learned]
- [Deeper dive into a pattern or debate from the research]
```
**Example invitations (to show the quality bar):**
For `/last30days nano banana pro prompts for Gemini`:
> I'm now an expert on Nano Banana Pro for Gemini. What do you want to make? For example:
> - Photorealistic product shots with natural lighting (the most requested style right now)
> - Logo designs with embedded text (Gemini's new strength per the research)
> - Multi-reference style transfer from a mood board
>
> Just describe your vision and I'll write a prompt you can paste straight into Gemini.
For `/last30days kanye west` (GENERAL):
> I'm now an expert on Kanye West. Some things I can help with:
> - What's the real story behind the apology letter — genuine or PR move?
> - Break down the BULLY tracklist reactions and what fans are expecting
> - Compare how Reddit vs X are reacting to the Bianca narrative
For `/last30days war in Iran` (NEWS):
> I'm now an expert on the Iran situation. Some things you could ask:
> - What are the realistic escalation scenarios from here?
> - How is this playing differently in US vs international media?
> - What's the economic impact on oil markets so far?
---
## WAIT FOR USER'S RESPONSE
After showing the stats summary with your invitation, **STOP and wait** for the user to respond.
---
## WHEN USER RESPONDS
**Read their response and match the intent:**
- If they ask a **QUESTION** about the topic → Answer from your research (no new searches, no prompt)
- If they ask to **GO DEEPER** on a subtopic → Elaborate using your research findings
- If they describe something they want to **CREATE** → Write ONE perfect prompt (see below)
- If they ask for a **PROMPT** explicitly → Write ONE perfect prompt (see below)
**Only write a prompt when the user wants one.** Don't force a prompt on someone who asked "what could happen next with Iran."
### Writing a Prompt
When the user wants a prompt, write a **single, highly-tailored prompt** using your research expertise.
### CRITICAL: Match the FORMAT the research recommends
**If research says to use a specific prompt FORMAT, YOU MUST USE THAT FORMAT.**
**ANTI-PATTERN**: Research says "use JSON prompts with device specs" but you write plain prose. This defeats the entire purpose of the research.
### Quality Checklist (run before delivering):
- [ ] **FORMAT MATCHES RESEARCH** - If research said JSON/structured/etc, prompt IS that format
- [ ] Directly addresses what the user said they want to create
- [ ] Uses specific patterns/keywords discovered in research
- [ ] Ready to paste with zero edits (or minimal [PLACEHOLDERS] clearly marked)
- [ ] Appropriate length and style for TARGET_TOOL
### Output Format:
```
Here's your prompt for {TARGET_TOOL}:
---
[The actual prompt IN THE FORMAT THE RESEARCH RECOMMENDS]
---
This uses [brief 1-line explanation of what research insight you applied].
```
---
## IF USER ASKS FOR MORE OPTIONS
Only if they ask for alternatives or more prompts, provide 2-3 variations. Don't dump a prompt pack unless requested.
---
## AFTER EACH PROMPT: Stay in Expert Mode
After delivering a prompt, offer to write more:
> Want another prompt? Just tell me what you're creating next.
---
## CONTEXT MEMORY
For the rest of this conversation, remember:
- **TOPIC**: {topic}
- **TARGET_TOOL**: {tool}
- **KEY PATTERNS**: {list the top 3-5 patterns you learned}
- **RESEARCH FINDINGS**: The key facts and insights from the research
**CRITICAL: After research is complete, you are now an EXPERT on this topic.**
When the user asks follow-up questions:
- **DO NOT run new WebSearches** - you already have the research
- **Answer from what you learned** - cite the Reddit threads, X posts, and web sources
- **If they ask a question** - answer it from your research findings
- **If they ask for a prompt** - write one using your expertise
Only do new research if the user explicitly asks about a DIFFERENT topic.
---
## Output Summary Footer (After Each Prompt)
After delivering a prompt, end with:
```
---
📚 Expert in: {TOPIC} for {TARGET_TOOL}
📊 Based on: {n} Reddit threads ({sum} upvotes) + {n} X posts ({sum} likes) + {n} YouTube videos ({sum} views) + {n} HN stories ({sum} points) + {n} web pages
Want another prompt? Just tell me what you're creating next.
```
---
## Security & Permissions
**What this skill does:**
- Sends search queries to OpenAI's Responses API (`api.openai.com`) for Reddit discovery
- Sends search queries to Twitter's GraphQL API (via browser cookie auth) or xAI's API (`api.x.ai`) for X search
- Sends search queries to Algolia HN Search API (`hn.algolia.com`) for Hacker News story and comment discovery (free, no auth)
- Sends search queries to Polymarket Gamma API (`gamma-api.polymarket.com`) for prediction market discovery (free, no auth)
- Runs `yt-dlp` locally for YouTube search and transcript extraction (no API key, public data)
- Optionally sends search queries to Brave Search API, Parallel AI API, or OpenRouter API for web search
- Fetches public Reddit thread data from `reddit.com` for engagement metrics
- Stores research findings in local SQLite database (watchlist mode only)
**What this skill does NOT do:**
- Does not post, like, or modify content on any platform
- Does not access your Reddit, X, or YouTube accounts
- Does not share API keys between providers (OpenAI key only goes to api.openai.com, etc.)
- Does not log, cache, or write API keys to output files
- Does not send data to any endpoint not listed above
- Hacker News and Polymarket sources are always available (no API key, no binary dependency)
- Can be invoked autonomously by agents via the Skill tool (runs inline, not forked); pass `--agent` for non-interactive report output
**Bundled scripts:** `scripts/last30days.py` (main research engine), `scripts/lib/` (search, enrichment, rendering modules), `scripts/lib/vendor/bird-search/` (vendored X search client, MIT licensed)
Review scripts before first use to verify behavior.


@@ -0,0 +1,260 @@
#!/usr/bin/env python3
"""Morning briefing generator for last30days.

Synthesizes accumulated findings into formatted briefings.
The Python script collects the data; the agent (via SKILL.md) does the
beautiful synthesis. This script provides the structured data.

Usage:
    python3 briefing.py generate            # Daily briefing data
    python3 briefing.py generate --weekly   # Weekly digest data
    python3 briefing.py show [--date DATE]  # Show saved briefing
"""
import argparse
import json
import sys
from datetime import datetime, timedelta
from pathlib import Path

SCRIPT_DIR = Path(__file__).parent.resolve()
sys.path.insert(0, str(SCRIPT_DIR))

import store

BRIEFS_DIR = Path.home() / ".local" / "share" / "last30days" / "briefs"
def generate_daily(since: str = None) -> dict:
    """Generate daily briefing data.

    Returns structured data for the agent to synthesize into a beautiful briefing.
    """
    store.init_db()
    topics = store.list_topics()
    if not topics:
        return {
            "status": "no_topics",
            "message": "No watchlist topics yet. Add one with: last30days watch add \"your topic\"",
        }
    enabled = [t for t in topics if t["enabled"]]
    if not enabled:
        return {
            "status": "no_enabled",
            "message": "All topics are paused. Enable a topic to generate briefings.",
        }

    # Default: findings since yesterday
    if not since:
        since = (datetime.now() - timedelta(days=1)).strftime("%Y-%m-%d")

    briefing_topics = []
    total_new = 0
    for topic in enabled:
        findings = store.get_new_findings(topic["id"], since)
        last_run = topic.get("last_run")
        last_status = topic.get("last_status", "unknown")

        # Calculate staleness
        stale = False
        hours_ago = None
        if last_run:
            try:
                run_dt = datetime.fromisoformat(last_run.replace("Z", "+00:00"))
                hours_ago = (datetime.now() - run_dt.replace(tzinfo=None)).total_seconds() / 3600
                stale = hours_ago > 36  # Stale if > 36 hours
            except (ValueError, TypeError):
                stale = True

        topic_data = {
            "name": topic["name"],
            "findings": findings,
            "new_count": len(findings),
            "last_run": last_run,
            "last_status": last_status,
            "stale": stale,
            "hours_ago": round(hours_ago, 1) if hours_ago else None,
        }
        # Extract top finding by engagement
        if findings:
            top = max(findings, key=lambda f: f.get("engagement_score", 0))
            topic_data["top_finding"] = {
                "title": top.get("source_title", ""),
                "source": top.get("source", ""),
                "author": top.get("author", ""),
                "engagement": top.get("engagement_score", 0),
                "content": top.get("content", "")[:300],
            }
        briefing_topics.append(topic_data)
        total_new += len(findings)

    # Cost info
    daily_cost = store.get_daily_cost()
    budget = float(store.get_setting("daily_budget", "5.00"))

    # Find the single top finding across all topics (for TL;DR)
    all_findings = []
    for t in briefing_topics:
        for f in t["findings"]:
            f["_topic"] = t["name"]
            all_findings.append(f)
    top_overall = None
    if all_findings:
        top_overall = max(all_findings, key=lambda f: f.get("engagement_score", 0))

    result = {
        "status": "ok",
        "date": datetime.now().strftime("%Y-%m-%d"),
        "since": since,
        "topics": briefing_topics,
        "total_new": total_new,
        "total_topics": len(briefing_topics),
        "top_finding": {
            "title": top_overall.get("source_title", ""),
            "topic": top_overall.get("_topic", ""),
            "engagement": top_overall.get("engagement_score", 0),
        } if top_overall else None,
        "cost": {
            "daily": daily_cost,
            "budget": budget,
        },
        "failed_topics": [
            t["name"] for t in briefing_topics if t["last_status"] == "failed"
        ],
    }
    # Save briefing data
    _save_briefing(result)
    return result
def generate_weekly() -> dict:
    """Generate weekly digest data with trend analysis."""
    store.init_db()
    week_ago = (datetime.now() - timedelta(days=7)).strftime("%Y-%m-%d")
    two_weeks_ago = (datetime.now() - timedelta(days=14)).strftime("%Y-%m-%d")
    topics = store.list_topics()
    if not topics:
        return {"status": "no_topics", "message": "No watchlist topics."}

    weekly_topics = []
    for topic in topics:
        if not topic["enabled"]:
            continue
        # This week's findings
        this_week = store.get_new_findings(topic["id"], week_ago)
        # Last week's findings (for comparison)
        conn = store._connect()
        try:
            last_week_rows = conn.execute(
                """SELECT * FROM findings
                WHERE topic_id = ? AND first_seen >= ? AND first_seen < ? AND dismissed = 0
                ORDER BY engagement_score DESC""",
                (topic["id"], two_weeks_ago, week_ago),
            ).fetchall()
            last_week = [dict(r) for r in last_week_rows]
        finally:
            conn.close()

        this_engagement = sum(f.get("engagement_score", 0) for f in this_week)
        last_engagement = sum(f.get("engagement_score", 0) for f in last_week)
        # Trend calculation
        if last_engagement > 0:
            engagement_change = ((this_engagement - last_engagement) / last_engagement) * 100
        else:
            engagement_change = 100 if this_engagement > 0 else 0

        weekly_topics.append({
            "name": topic["name"],
            "this_week_count": len(this_week),
            "last_week_count": len(last_week),
            "this_week_engagement": this_engagement,
            "last_week_engagement": last_engagement,
            "engagement_change_pct": round(engagement_change, 1),
            "top_findings": this_week[:5],  # Top 5 by engagement (already sorted)
        })

    result = {
        "status": "ok",
        "type": "weekly",
        "week_of": week_ago,
        "topics": weekly_topics,
    }
    _save_briefing(result, suffix="-weekly")
    return result
def show_briefing(date: str = None) -> dict:
    """Load a saved briefing by date."""
    if not date:
        date = datetime.now().strftime("%Y-%m-%d")
    path = BRIEFS_DIR / f"{date}.json"
    if not path.exists():
        # Try weekly
        path = BRIEFS_DIR / f"{date}-weekly.json"
    if not path.exists():
        return {"status": "not_found", "message": f"No briefing found for {date}."}
    with open(path) as f:
        return json.load(f)
def _save_briefing(data: dict, suffix: str = ""):
    """Save briefing data to local archive."""
    BRIEFS_DIR.mkdir(parents=True, exist_ok=True)
    date = datetime.now().strftime("%Y-%m-%d")
    path = BRIEFS_DIR / f"{date}{suffix}.json"
    with open(path, "w") as f:
        json.dump(data, f, indent=2, default=str)
def main():
    parser = argparse.ArgumentParser(description="Generate last30days briefings")
    sub = parser.add_subparsers(dest="command")
    # generate
    g = sub.add_parser("generate", help="Generate a briefing")
    g.add_argument("--weekly", action="store_true", help="Weekly digest")
    g.add_argument("--since", help="Findings since date (YYYY-MM-DD)")
    # show
    s = sub.add_parser("show", help="Show a saved briefing")
    s.add_argument("--date", help="Date (YYYY-MM-DD, default: today)")
    args = parser.parse_args()

    if args.command == "generate":
        if args.weekly:
            result = generate_weekly()
        else:
            result = generate_daily(since=args.since)
        print(json.dumps(result, indent=2, default=str))
    elif args.command == "show":
        result = show_briefing(date=args.date)
        print(json.dumps(result, indent=2, default=str))
    else:
        parser.print_help()
        sys.exit(1)


if __name__ == "__main__":
    main()


@@ -0,0 +1,120 @@
#!/usr/bin/env python3
"""Evaluate synthesis outputs using a blinded comparison rubric.
Reads from docs/comparison-results/synthesis/, evaluates each topic's
3 versions (base, hn, cross) on a 5-dimension rubric.
Since we can't call the Anthropic API directly (no SDK installed),
this script formats the evaluation prompts for manual evaluation
and provides a framework for scoring.
"""
import random
from pathlib import Path
SYNTHESIS_DIR = Path(__file__).parent.parent / "docs" / "comparison-results" / "synthesis"
EVAL_DIR = Path(__file__).parent.parent / "docs" / "comparison-results" / "evaluation"
EVAL_DIR.mkdir(parents=True, exist_ok=True)
TOPICS = [
(1, 'claude-code', 'Claude Code skills and MCP servers', 'GENERAL'),
(2, 'seedance', 'Seedance AI video generation', 'NEWS'),
(3, 'macbook', 'M4 MacBook Pro review', 'RECOMMENDATIONS'),
(4, 'rap', 'best rap songs 2026', 'RECOMMENDATIONS'),
(5, 'react-svelte', 'React vs Svelte 2026', 'GENERAL'),
]
VERSIONS = ['base', 'hn', 'cross']
RUBRIC = """## Evaluation Rubric
Score each version 1-5 on these dimensions:
### 1. GROUNDEDNESS (30%)
Does the narrative cite specific sources from the research data?
- 1: Generic statements, no citations, could be written without any research
- 3: Some citations but mixed with pre-existing knowledge filler
- 5: Every finding backed by a specific source ("per @handle", "per r/sub", "per [channel]")
### 2. SPECIFICITY (25%)
Are findings specific (named entities, exact numbers) or vague?
- 1: Vague generalities ("AI video tools are improving", "developers are debating frameworks")
- 3: Some specifics mixed with generic padding
- 5: Named products, exact numbers, version names ("Seedance 2.0 added lip sync", "698 likes")
### 3. COVERAGE (20%)
Does the synthesis represent findings from all available data sources?
- 1: Only mentions 1-2 sources, ignores others
- 3: Mentions most sources but unevenly weighted
- 5: Naturally weaves Reddit, X, YouTube (and HN if available) into the narrative
### 4. ACTIONABILITY (15%)
Does the invitation give specific, research-derived next steps?
- 1: Generic "let me know if you want more info"
- 3: Somewhat specific but not clearly grounded in research findings
- 5: Each suggestion references a specific thing from the research ("I can compare Seedance 2.0 vs Kling")
### 5. FORMAT COMPLIANCE (10%)
Does it follow the expected output format?
- 1: Missing stats block, no invitation, wrong structure
- 3: Partial stats block, generic invitation
- 5: Perfect stats block with real counts, source box-drawing chars, top voices identified
"""
for num, slug, topic, qtype in TOPICS:
# Randomly assign labels to prevent position bias
versions_shuffled = list(VERSIONS)
random.seed(num * 42) # Deterministic but different per topic
random.shuffle(versions_shuffled)
label_map = {v: chr(65 + i) for i, v in enumerate(versions_shuffled)}
reverse_map = {chr(65 + i): v for i, v in enumerate(versions_shuffled)}
lines = []
lines.append(f"# Evaluation: {topic}")
lines.append(f"")
lines.append(f"**Query Type:** {qtype}")
lines.append(f"**Label Map (REVEAL AFTER SCORING):** {reverse_map}")
lines.append(f"")
lines.append(RUBRIC)
lines.append("")
for v in versions_shuffled:
label = label_map[v]
synthesis_file = SYNTHESIS_DIR / f"{v}-{num}-{slug}.md"
if synthesis_file.exists():
content = synthesis_file.read_text()
else:
content = f"[FILE NOT FOUND: {synthesis_file}]"
lines.append(f"---")
lines.append(f"## VERSION {label}")
lines.append(f"")
lines.append(content)
lines.append(f"")
lines.append("---")
lines.append("## SCORES")
lines.append("")
for v in versions_shuffled:
label = label_map[v]
lines.append(f"### Version {label}")
lines.append(f"- Groundedness: /5")
lines.append(f"- Specificity: /5")
lines.append(f"- Coverage: /5")
lines.append(f"- Actionability: /5")
lines.append(f"- Format: /5")
lines.append(f"- **Weighted Total**: /5.0")
lines.append(f"- Best/worst aspect: ")
lines.append(f"")
lines.append("## VERDICT")
lines.append("")
lines.append(f"**Winner for {topic}:** ")
lines.append(f"**Why:** ")
lines.append("")
lines.append(f"**Reveal:** {reverse_map}")
eval_file = EVAL_DIR / f"eval-{num}-{slug}.md"
eval_file.write_text("\n".join(lines))
print(f" {eval_file.name}: {len(lines)} lines, labels: {reverse_map}")
print(f"\n{len(TOPICS)} evaluation files written to {EVAL_DIR}")
print("Next step: Read each file, score the versions, fill in SCORES section")


@@ -0,0 +1,53 @@
#!/usr/bin/env python3
"""Convert JSON result files to compact markdown using render_compact().
Reads from docs/comparison-results/json/, writes to docs/comparison-results/compact/.
Uses the current checkout's render_compact(), since the version differences live in the
DATA (cross_refs, HN items, YouTube relevance), not in the render function.
"""
import json
import sys
from pathlib import Path
# Add scripts/ to path so we can import lib
sys.path.insert(0, str(Path(__file__).parent))
from lib.schema import Report
from lib.render import render_compact, render_source_status
JSON_DIR = Path(__file__).parent.parent / "docs" / "comparison-results" / "json"
COMPACT_DIR = Path(__file__).parent.parent / "docs" / "comparison-results" / "compact"
COMPACT_DIR.mkdir(parents=True, exist_ok=True)
files = sorted(JSON_DIR.glob("*.json"))
files = [f for f in files if f.name != "diagnose-baseline.json"]
print(f"Converting {len(files)} JSON files to compact markdown...\n")
for json_file in files:
with open(json_file) as f:
data = json.load(f)
report = Report.from_dict(data)
compact = render_compact(report)
source_status = render_source_status(report)
full_output = compact + "\n" + source_status
md_file = COMPACT_DIR / json_file.name.replace(".json", ".md")
md_file.write_text(full_output)
# Summary stats
n_reddit = len(report.reddit)
n_x = len(report.x)
n_yt = len(report.youtube)
n_hn = len(report.hackernews)
n_web = len(report.web)
xrefs = sum(1 for r in report.reddit if r.cross_refs)
xrefs += sum(1 for x in report.x if x.cross_refs)
xrefs += sum(1 for y in report.youtube if y.cross_refs)
xrefs += sum(1 for h in report.hackernews if h.cross_refs)
print(f" {json_file.name:40s} -> {len(full_output):5d} chars "
f"(R:{n_reddit} X:{n_x} YT:{n_yt} HN:{n_hn} W:{n_web} xref:{xrefs})")
print(f"\nDone. {len(files)} compact files written to {COMPACT_DIR}")

File diff suppressed because it is too large


@@ -0,0 +1 @@
# last30days library modules


@@ -0,0 +1,476 @@
"""Bird X search client - vendored Twitter GraphQL search for /last30days v2.1.
Uses a vendored subset of @steipete/bird v0.8.0 (MIT License) to search X
via Twitter's GraphQL API. No external `bird` CLI binary needed - just Node.js 22+.
"""
import json
import os
import signal
import shutil
import subprocess
import sys
from pathlib import Path
from datetime import datetime
from typing import Any, Dict, List, Optional, Tuple
# Path to the vendored bird-search wrapper
_BIRD_SEARCH_MJS = Path(__file__).parent / "vendor" / "bird-search" / "bird-search.mjs"
# Depth configurations: number of results to request
DEPTH_CONFIG = {
"quick": 12,
"default": 30,
"deep": 60,
}
# Module-level credentials injected from .env config
_credentials: Dict[str, str] = {}
def set_credentials(auth_token: Optional[str], ct0: Optional[str]):
"""Inject AUTH_TOKEN/CT0 from .env config so Node subprocesses can use them."""
if auth_token:
_credentials['AUTH_TOKEN'] = auth_token
if ct0:
_credentials['CT0'] = ct0
def _subprocess_env() -> Dict[str, str]:
"""Build env dict for Node subprocesses, merging injected credentials."""
env = os.environ.copy()
env.update(_credentials)
return env
def _log(msg: str):
"""Log to stderr."""
sys.stderr.write(f"[Bird] {msg}\n")
sys.stderr.flush()
def _extract_core_subject(topic: str) -> str:
"""Extract core subject from verbose query for X search.
X search is literal keyword AND matching — all words must appear.
Aggressively strip question/meta/research words to keep only the
core product/concept name (2-3 words max).
"""
text = topic.lower().strip()
# Phase 1: Strip multi-word prefixes (longest first)
prefixes = [
'what are the best', 'what is the best', 'what are the latest',
'what are people saying about', 'what do people think about',
'how do i use', 'how to use', 'how to',
'what are', 'what is', 'tips for', 'best practices for',
]
for p in prefixes:
if text.startswith(p + ' '):
text = text[len(p):].strip()
break
# Phase 2: Strip multi-word suffixes
suffixes = [
'best practices', 'use cases', 'prompt techniques',
'prompting techniques', 'prompting tips',
]
for s in suffixes:
if text.endswith(' ' + s):
text = text[:-len(s)].strip()
break
# Phase 3: Filter individual noise words
_noise = {
# Question/filler words
'a', 'an', 'the', 'is', 'are', 'was', 'were', 'and', 'or',
'of', 'in', 'on', 'for', 'with', 'about', 'to',
'people', 'saying', 'think', 'said', 'lately',
# Research/meta descriptors
'best', 'top', 'good', 'great', 'awesome', 'killer',
'latest', 'new', 'news', 'update', 'updates',
'trendiest', 'trending', 'hottest', 'hot', 'popular', 'viral',
'practices', 'features', 'guide', 'tutorial',
'recommendations', 'advice', 'review', 'reviews',
'usecases', 'examples', 'comparison', 'versus', 'vs',
'plugin', 'plugins', 'skill', 'skills', 'tool', 'tools',
# Prompting meta words
'prompt', 'prompts', 'prompting', 'techniques', 'tips',
'tricks', 'methods', 'strategies', 'approaches',
# Action words
'using', 'uses', 'use',
}
words = text.split()
result = [w for w in words if w not in _noise]
return ' '.join(result[:3]) or topic.lower().strip() # Max 3 words
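A reduced sketch of the three-phase reduction (with a tiny noise set for illustration; the real function uses the full prefix, suffix, and noise lists above):

```python
# Illustrative subset of the noise vocabulary used above.
NOISE = {'what', 'are', 'the', 'best', 'for', 'prompts', 'tips'}

def core_subject(topic):
    """Strip a question prefix, drop noise words, keep at most 3 tokens."""
    text = topic.lower().strip()
    for prefix in ('what are the best', 'what are', 'tips for'):
        if text.startswith(prefix + ' '):
            text = text[len(prefix):].strip()
            break
    kept = [w for w in text.split() if w not in NOISE]
    return ' '.join(kept[:3]) or text

print(core_subject("what are the best Claude Code prompts"))  # claude code
```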
def is_bird_installed() -> bool:
"""Check if vendored Bird search module is available.
Returns:
True if bird-search.mjs exists and Node.js 22+ is in PATH.
"""
if not _BIRD_SEARCH_MJS.exists():
return False
return shutil.which("node") is not None
def is_bird_authenticated() -> Optional[str]:
"""Check if X credentials are available (env vars or browser cookies).
Returns:
Auth source string if authenticated, None otherwise.
"""
if not is_bird_installed():
return None
try:
result = subprocess.run(
["node", str(_BIRD_SEARCH_MJS), "--whoami"],
capture_output=True,
text=True,
timeout=15,
env=_subprocess_env(),
)
if result.returncode == 0 and result.stdout.strip():
return result.stdout.strip().split('\n')[0]
return None
except (subprocess.TimeoutExpired, FileNotFoundError, subprocess.SubprocessError):
return None
def check_npm_available() -> bool:
"""Check if npm is available (kept for API compatibility).
Returns:
True if 'npm' command is available in PATH, False otherwise.
"""
return shutil.which("npm") is not None
def install_bird() -> Tuple[bool, str]:
"""No-op - Bird search is vendored in v2.1, no installation needed.
Returns:
Tuple of (success, message).
"""
if is_bird_installed():
return True, "Bird search is bundled with /last30days v2.1 - no installation needed."
if not shutil.which("node"):
return False, "Node.js 22+ is required for X search. Install Node.js first."
return False, f"Vendored bird-search.mjs not found at {_BIRD_SEARCH_MJS}"
def get_bird_status() -> Dict[str, Any]:
"""Get comprehensive Bird search status.
Returns:
Dict with keys: installed, authenticated, username, can_install
"""
installed = is_bird_installed()
auth_source = is_bird_authenticated() if installed else None
return {
"installed": installed,
"authenticated": auth_source is not None,
"username": auth_source, # Now returns auth source (e.g., "Safari", "env AUTH_TOKEN")
"can_install": True, # Always vendored in v2.1
}
def _run_bird_search(query: str, count: int, timeout: int) -> Dict[str, Any]:
"""Run a search using the vendored bird-search.mjs module.
Args:
query: Full search query string (including since: filter)
count: Number of results to request
timeout: Timeout in seconds
Returns:
Raw Bird JSON response or error dict.
"""
cmd = [
"node", str(_BIRD_SEARCH_MJS),
query,
"--count", str(count),
"--json",
]
# Use process groups for clean cleanup on timeout/kill
preexec = os.setsid if hasattr(os, 'setsid') else None
try:
proc = subprocess.Popen(
cmd,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
preexec_fn=preexec,
env=_subprocess_env(),
)
# Register for cleanup tracking (if available)
try:
from last30days import register_child_pid, unregister_child_pid
register_child_pid(proc.pid)
except ImportError:
pass
try:
stdout, stderr = proc.communicate(timeout=timeout)
except subprocess.TimeoutExpired:
# Kill the entire process group
try:
os.killpg(os.getpgid(proc.pid), signal.SIGTERM)
except (ProcessLookupError, PermissionError, OSError):
proc.kill()
proc.wait(timeout=5)
return {"error": f"Search timed out after {timeout}s", "items": []}
finally:
try:
from last30days import unregister_child_pid
unregister_child_pid(proc.pid)
except Exception:  # includes ImportError; cleanup must never raise
pass
if proc.returncode != 0:
error = stderr.strip() if stderr else "Bird search failed"
return {"error": error, "items": []}
output = stdout.strip() if stdout else ""
if not output:
return {"items": []}
return json.loads(output)
except json.JSONDecodeError as e:
return {"error": f"Invalid JSON response: {e}", "items": []}
except Exception as e:
return {"error": str(e), "items": []}
def search_x(
topic: str,
from_date: str,
to_date: str,
depth: str = "default",
) -> Dict[str, Any]:
"""Search X using Bird CLI with automatic retry on 0 results.
Args:
topic: Search topic
from_date: Start date (YYYY-MM-DD)
to_date: End date (YYYY-MM-DD) - unused but kept for API compatibility
depth: Research depth - "quick", "default", or "deep"
Returns:
Raw Bird JSON response or error dict.
"""
count = DEPTH_CONFIG.get(depth, DEPTH_CONFIG["default"])
timeout = 30 if depth == "quick" else 45 if depth == "default" else 60
# Extract core subject - X search is literal, not semantic
core_topic = _extract_core_subject(topic)
query = f"{core_topic} since:{from_date}"
_log(f"Searching: {query}")
response = _run_bird_search(query, count, timeout)
# Check if we got results
items = parse_bird_response(response)
# Retry with fewer keywords if 0 results and query has 3+ words
core_words = core_topic.split()
if not items and len(core_words) > 2:
shorter = ' '.join(core_words[:2])
_log(f"0 results for '{core_topic}', retrying with '{shorter}'")
query = f"{shorter} since:{from_date}"
response = _run_bird_search(query, count, timeout)
items = parse_bird_response(response)
# Last-chance retry: use strongest remaining token (often the product name)
if not items and core_words:
low_signal = {
'trendiest', 'trending', 'hottest', 'hot', 'popular', 'viral',
'best', 'top', 'latest', 'new', 'plugin', 'plugins',
'skill', 'skills', 'tool', 'tools',
}
candidates = [w for w in core_words if w not in low_signal]
if candidates:
strongest = max(candidates, key=len)
if strongest != core_topic: # skip if it would repeat the identical query
_log(f"0 results for '{core_topic}', retrying with strongest token '{strongest}'")
query = f"{strongest} since:{from_date}"
response = _run_bird_search(query, count, timeout)
return response
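The last-chance retry keys on the longest non-generic token, which is usually the product name. A standalone sketch of that selection step (the helper name is illustrative):

```python
# Generic qualifiers that rarely identify the subject on their own.
LOW_SIGNAL = {
    'trendiest', 'trending', 'hottest', 'hot', 'popular', 'viral',
    'best', 'top', 'latest', 'new', 'plugin', 'plugins',
    'skill', 'skills', 'tool', 'tools',
}

def strongest_token(core_words):
    """Pick the longest token that is not a generic qualifier;
    return None when only generic words remain."""
    candidates = [w for w in core_words if w not in LOW_SIGNAL]
    return max(candidates, key=len) if candidates else None

print(strongest_token(['best', 'seedance', 'tools']))  # seedance
```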
def search_handles(
handles: List[str],
topic: Optional[str],
from_date: str,
count_per: int = 5,
) -> List[Dict[str, Any]]:
"""Search specific X handles for topic-related content.
Runs targeted Bird searches using `from:handle topic` syntax.
Used in Phase 2 supplemental search after entity extraction.
Args:
handles: List of X handles to search (without @)
topic: Search topic (core subject), or None for unfiltered search
from_date: Start date (YYYY-MM-DD)
count_per: Results to request per handle
Returns:
List of raw item dicts (same format as parse_bird_response output).
"""
all_items = []
core_topic = _extract_core_subject(topic) if topic else None
for handle in handles:
handle = handle.lstrip("@")
if core_topic:
query = f"from:{handle} {core_topic} since:{from_date}"
else:
query = f"from:{handle} since:{from_date}"
cmd = [
"node", str(_BIRD_SEARCH_MJS),
query,
"--count", str(count_per),
"--json",
]
preexec = os.setsid if hasattr(os, 'setsid') else None
try:
proc = subprocess.Popen(
cmd,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
preexec_fn=preexec,
env=_subprocess_env(), # pass injected AUTH_TOKEN/CT0, as in _run_bird_search
)
try:
stdout, stderr = proc.communicate(timeout=15)
except subprocess.TimeoutExpired:
try:
os.killpg(os.getpgid(proc.pid), signal.SIGTERM)
except (ProcessLookupError, PermissionError, OSError):
proc.kill()
proc.wait(timeout=5)
_log(f"Handle search timed out for @{handle}")
continue
if proc.returncode != 0:
_log(f"Handle search failed for @{handle}: {(stderr or '').strip()}")
continue
output = (stdout or "").strip()
if not output:
continue
response = json.loads(output)
items = parse_bird_response(response)
all_items.extend(items)
except json.JSONDecodeError:
_log(f"Invalid JSON from handle search for @{handle}")
except Exception as e:
_log(f"Handle search error for @{handle}: {e}")
return all_items
def parse_bird_response(response: Dict[str, Any]) -> List[Dict[str, Any]]:
"""Parse Bird response to match xai_x output format.
Args:
response: Raw Bird JSON response
Returns:
List of normalized item dicts matching xai_x.parse_x_response() format.
"""
items = []
# Check for errors
if "error" in response and response["error"]:
_log(f"Bird error: {response['error']}")
return items
# Bird returns a list of tweets directly or under a key
raw_items = response if isinstance(response, list) else response.get("items", response.get("tweets", []))
if not isinstance(raw_items, list):
return items
for i, tweet in enumerate(raw_items):
if not isinstance(tweet, dict):
continue
# Extract URL - Bird uses permanent_url or we construct from id
url = tweet.get("permanent_url") or tweet.get("url", "")
if not url and tweet.get("id"):
# Try different field structures Bird might use
author = tweet.get("author", {}) or tweet.get("user", {})
screen_name = author.get("username") or author.get("screen_name", "")
if screen_name:
url = f"https://x.com/{screen_name}/status/{tweet['id']}"
if not url:
continue
# Parse date from created_at/createdAt (e.g., "Wed Jan 15 14:30:00 +0000 2026")
date = None
created_at = tweet.get("createdAt") or tweet.get("created_at", "")
if created_at:
try:
# Try ISO format first (e.g., "2026-02-03T22:33:32Z")
# Check for ISO date separator, not just "T" (which appears in "Tue")
if len(created_at) > 10 and created_at[10] == "T":
dt = datetime.fromisoformat(created_at.replace("Z", "+00:00"))
else:
# Twitter format: "Wed Jan 15 14:30:00 +0000 2026"
dt = datetime.strptime(created_at, "%a %b %d %H:%M:%S %z %Y")
date = dt.strftime("%Y-%m-%d")
except (ValueError, TypeError):
pass
# Extract user info (Bird uses author.username, older format uses user.screen_name)
author = tweet.get("author", {}) or tweet.get("user", {})
author_handle = author.get("username") or author.get("screen_name", "") or tweet.get("author_handle", "")
# Build engagement dict (Bird uses camelCase: likeCount, retweetCount, etc.)
engagement = {
"likes": tweet.get("likeCount") or tweet.get("like_count") or tweet.get("favorite_count"),
"reposts": tweet.get("retweetCount") or tweet.get("retweet_count"),
"replies": tweet.get("replyCount") or tweet.get("reply_count"),
"quotes": tweet.get("quoteCount") or tweet.get("quote_count"),
}
# Convert to int where possible
for key in engagement:
if engagement[key] is not None:
try:
engagement[key] = int(engagement[key])
except (ValueError, TypeError):
engagement[key] = None
# Build normalized item
item = {
"id": f"X{i+1}",
"text": str(tweet.get("text", tweet.get("full_text", ""))).strip()[:500],
"url": url,
"author_handle": author_handle.lstrip("@"),
"date": date,
"engagement": engagement if any(v is not None for v in engagement.values()) else None,
"why_relevant": "", # Bird doesn't provide relevance explanations
"relevance": 0.7, # Default relevance, let score.py re-rank
}
items.append(item)
return items
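The timestamp normalization above handles both formats Bird may emit; the "T" at index 10 distinguishes ISO strings from weekday-prefixed ones (where "Tue" would also contain a "T"). A self-contained sketch:

```python
from datetime import datetime

def tweet_date(created_at):
    """Normalize either ISO ("2026-02-03T22:33:32Z") or classic Twitter
    ("Wed Jan 15 14:30:00 +0000 2026") timestamps to YYYY-MM-DD."""
    if len(created_at) > 10 and created_at[10] == "T":
        dt = datetime.fromisoformat(created_at.replace("Z", "+00:00"))
    else:
        dt = datetime.strptime(created_at, "%a %b %d %H:%M:%S %z %Y")
    return dt.strftime("%Y-%m-%d")

print(tweet_date("2026-02-03T22:33:32Z"))            # 2026-02-03
print(tweet_date("Wed Jan 15 14:30:00 +0000 2026"))  # 2026-01-15
```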


@@ -0,0 +1,213 @@
"""Brave Search web search for last30days skill.
Uses the Brave Search API as a fallback web search backend.
Simple, cheap (free tier: 2,000 queries/month), widely available.
API docs: https://api-dashboard.search.brave.com/app/documentation/web-search/get-started
"""
import html
import re
import sys
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List, Optional
from urllib.parse import urlencode, urlparse
from . import http
ENDPOINT = "https://api.search.brave.com/res/v1/web/search"
# Freshness codes: pd=24h, pw=7d, pm=31d
FRESHNESS_MAP = {1: "pd", 7: "pw", 31: "pm"}
# Domains to exclude (handled by Reddit/X search)
EXCLUDED_DOMAINS = {
"reddit.com", "www.reddit.com", "old.reddit.com",
"twitter.com", "www.twitter.com", "x.com", "www.x.com",
}
def search_web(
topic: str,
from_date: str,
to_date: str,
api_key: str,
depth: str = "default",
) -> List[Dict[str, Any]]:
"""Search the web via Brave Search API.
Args:
topic: Search topic
from_date: Start date (YYYY-MM-DD)
to_date: End date (YYYY-MM-DD)
api_key: Brave Search API key
depth: 'quick', 'default', or 'deep'
Returns:
List of result dicts with keys: url, title, snippet, source_domain, date, relevance
Raises:
http.HTTPError: On API errors
"""
count = {"quick": 8, "default": 15, "deep": 25}.get(depth, 15)
# Calculate days for freshness filter
days = _days_between(from_date, to_date)
freshness = _brave_freshness(days)
params = {
"q": topic,
"result_filter": "web,news",
"count": count,
"safesearch": "strict",
"text_decorations": 0,
"spellcheck": 0,
}
if freshness:
params["freshness"] = freshness
url = f"{ENDPOINT}?{urlencode(params)}"
sys.stderr.write(f"[Web] Searching Brave for: {topic}\n")
sys.stderr.flush()
response = http.request(
"GET",
url,
headers={"X-Subscription-Token": api_key},
timeout=15,
)
return _normalize_results(response, from_date, to_date)
def _days_between(from_date: str, to_date: str) -> int:
"""Calculate days between two YYYY-MM-DD dates."""
try:
d1 = datetime.strptime(from_date, "%Y-%m-%d")
d2 = datetime.strptime(to_date, "%Y-%m-%d")
return max(1, (d2 - d1).days)
except (ValueError, TypeError):
return 30
def _brave_freshness(days: Optional[int]) -> Optional[str]:
"""Convert days to Brave freshness parameter.
Uses canned codes for <=31d, explicit date range for longer periods.
"""
if days is None:
return None
code = next((v for d, v in sorted(FRESHNESS_MAP.items()) if days <= d), None)
if code:
return code
start = (datetime.now(timezone.utc) - timedelta(days=days)).strftime("%Y-%m-%d")
end = datetime.now(timezone.utc).strftime("%Y-%m-%d")
return f"{start}to{end}"
def _normalize_results(
response: Dict[str, Any],
from_date: str,
to_date: str,
) -> List[Dict[str, Any]]:
"""Convert Brave Search response to websearch item schema.
Merges news + web results, cleans HTML entities, filters excluded domains.
"""
items = []
# Merge news results (tend to be more recent) with web results
raw_results = (
response.get("news", {}).get("results", []) +
response.get("web", {}).get("results", [])
)
for i, result in enumerate(raw_results):
if not isinstance(result, dict):
continue
url = result.get("url", "")
if not url:
continue
# Skip excluded domains
try:
domain = urlparse(url).netloc.lower()
if domain in EXCLUDED_DOMAINS:
continue
if domain.startswith("www."):
domain = domain[4:]
except Exception:
domain = ""
title = _clean_html(str(result.get("title", "")).strip())
snippet = _clean_html(str(result.get("description", "")).strip())
if not title and not snippet:
continue
# Parse date from Brave's 'age' field or 'page_age'
date = _parse_brave_date(result.get("age"), result.get("page_age"))
date_confidence = "med" if date else "low"
items.append({
"id": f"W{i+1}",
"title": title[:200],
"url": url,
"source_domain": domain,
"snippet": snippet[:500],
"date": date,
"date_confidence": date_confidence,
"relevance": 0.6, # Brave doesn't provide relevance scores
"why_relevant": "",
})
sys.stderr.write(f"[Web] Brave: {len(items)} results\n")
sys.stderr.flush()
return items
def _clean_html(text: str) -> str:
"""Remove HTML tags and decode entities."""
text = re.sub(r"<[^>]*>", "", text)
text = html.unescape(text)
return text
def _parse_brave_date(age: Optional[str], page_age: Optional[str]) -> Optional[str]:
"""Parse Brave's age/page_age fields to YYYY-MM-DD.
Brave returns dates like "3 hours ago", "2 days ago", "January 24, 2026".
"""
text = age or page_age
if not text:
return None
text_lower = text.lower().strip()
now = datetime.now()
# "X hours ago" -> today
if re.search(r'\d+\s*hours?\s*ago', text_lower):
return now.strftime("%Y-%m-%d")
# "X days ago"
match = re.search(r'(\d+)\s*days?\s*ago', text_lower)
if match:
days = int(match.group(1))
if days <= 60:
return (now - timedelta(days=days)).strftime("%Y-%m-%d")
# "X weeks ago"
match = re.search(r'(\d+)\s*weeks?\s*ago', text_lower)
if match:
weeks = int(match.group(1))
return (now - timedelta(weeks=weeks)).strftime("%Y-%m-%d")
# ISO format: 2026-01-24T...
match = re.search(r'(\d{4}-\d{2}-\d{2})', text)
if match:
return match.group(1)
return None
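The relative-age resolution above can be exercised against a fixed "now"; a reduced version (relative phrases plus embedded ISO dates, names illustrative):

```python
import re
from datetime import datetime, timedelta

def age_to_date(text, now):
    """Resolve Brave-style relative ages ("3 hours ago", "2 days ago")
    or an embedded YYYY-MM-DD against a fixed reference time."""
    t = text.lower().strip()
    if re.search(r'\d+\s*hours?\s*ago', t):
        return now.strftime("%Y-%m-%d")
    m = re.search(r'(\d+)\s*days?\s*ago', t)
    if m:
        return (now - timedelta(days=int(m.group(1)))).strftime("%Y-%m-%d")
    m = re.search(r'(\d{4}-\d{2}-\d{2})', text)
    return m.group(1) if m else None

now = datetime(2026, 1, 24)
print(age_to_date("2 days ago", now))  # 2026-01-22
```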


@@ -0,0 +1,165 @@
"""Caching utilities for last30days skill."""
import hashlib
import json
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Optional
CACHE_DIR = Path.home() / ".cache" / "last30days"
DEFAULT_TTL_HOURS = 24
MODEL_CACHE_TTL_DAYS = 7
MODEL_CACHE_FILE = CACHE_DIR / "model_selection.json"
def ensure_cache_dir():
"""Ensure cache directory exists. Supports env override and sandbox fallback."""
global CACHE_DIR, MODEL_CACHE_FILE
env_dir = os.environ.get("LAST30DAYS_CACHE_DIR")
if env_dir:
CACHE_DIR = Path(env_dir)
MODEL_CACHE_FILE = CACHE_DIR / "model_selection.json"
try:
CACHE_DIR.mkdir(parents=True, exist_ok=True)
except PermissionError:
CACHE_DIR = Path(tempfile.gettempdir()) / "last30days" / "cache"
MODEL_CACHE_FILE = CACHE_DIR / "model_selection.json"
CACHE_DIR.mkdir(parents=True, exist_ok=True)
def get_cache_key(topic: str, from_date: str, to_date: str, sources: str) -> str:
"""Generate a cache key from query parameters."""
key_data = f"{topic}|{from_date}|{to_date}|{sources}"
return hashlib.sha256(key_data.encode()).hexdigest()[:16]
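The key derivation above is deterministic over the query parameters, so identical queries hit the same cache file. A self-contained sketch:

```python
import hashlib

def cache_key(topic, from_date, to_date, sources):
    """Deterministic 16-hex-char key over the query parameters,
    as in get_cache_key above."""
    key_data = f"{topic}|{from_date}|{to_date}|{sources}"
    return hashlib.sha256(key_data.encode()).hexdigest()[:16]
```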
def get_cache_path(cache_key: str) -> Path:
"""Get path to cache file."""
return CACHE_DIR / f"{cache_key}.json"
def is_cache_valid(cache_path: Path, ttl_hours: int = DEFAULT_TTL_HOURS) -> bool:
"""Check if cache file exists and is within TTL."""
if not cache_path.exists():
return False
try:
stat = cache_path.stat()
mtime = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
now = datetime.now(timezone.utc)
age_hours = (now - mtime).total_seconds() / 3600
return age_hours < ttl_hours
except OSError:
return False
def load_cache(cache_key: str, ttl_hours: int = DEFAULT_TTL_HOURS) -> Optional[dict]:
"""Load data from cache if valid."""
cache_path = get_cache_path(cache_key)
if not is_cache_valid(cache_path, ttl_hours):
return None
try:
with open(cache_path, 'r') as f:
return json.load(f)
except (json.JSONDecodeError, OSError):
return None
def get_cache_age_hours(cache_path: Path) -> Optional[float]:
"""Get age of cache file in hours."""
if not cache_path.exists():
return None
try:
stat = cache_path.stat()
mtime = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
now = datetime.now(timezone.utc)
return (now - mtime).total_seconds() / 3600
except OSError:
return None
def load_cache_with_age(cache_key: str, ttl_hours: int = DEFAULT_TTL_HOURS) -> tuple:
"""Load data from cache with age info.
Returns:
Tuple of (data, age_hours) or (None, None) if invalid
"""
cache_path = get_cache_path(cache_key)
if not is_cache_valid(cache_path, ttl_hours):
return None, None
age = get_cache_age_hours(cache_path)
try:
with open(cache_path, 'r') as f:
return json.load(f), age
except (json.JSONDecodeError, OSError):
return None, None
def save_cache(cache_key: str, data: dict):
"""Save data to cache."""
ensure_cache_dir()
cache_path = get_cache_path(cache_key)
try:
with open(cache_path, 'w') as f:
json.dump(data, f)
except OSError:
pass # Silently fail on cache write errors
def clear_cache():
"""Clear all cache files."""
if CACHE_DIR.exists():
for f in CACHE_DIR.glob("*.json"):
try:
f.unlink()
except OSError:
pass
# Model selection cache (longer TTL) — MODEL_CACHE_FILE is set at module level
# and updated by ensure_cache_dir() if env override or fallback is needed.
def load_model_cache() -> dict:
"""Load model selection cache."""
if not is_cache_valid(MODEL_CACHE_FILE, MODEL_CACHE_TTL_DAYS * 24):
return {}
try:
with open(MODEL_CACHE_FILE, 'r') as f:
return json.load(f)
except (json.JSONDecodeError, OSError):
return {}
def save_model_cache(data: dict):
"""Save model selection cache."""
ensure_cache_dir()
try:
with open(MODEL_CACHE_FILE, 'w') as f:
json.dump(data, f)
except OSError:
pass
def get_cached_model(provider: str) -> Optional[str]:
"""Get cached model selection for a provider."""
cache = load_model_cache()
return cache.get(provider)
def set_cached_model(provider: str, model: str):
"""Cache model selection for a provider."""
cache = load_model_cache()
cache[provider] = model
cache['updated_at'] = datetime.now(timezone.utc).isoformat()
save_model_cache(cache)


@@ -0,0 +1,124 @@
"""Date utilities for last30days skill."""
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple
def get_date_range(days: int = 30) -> Tuple[str, str]:
"""Get the date range for the last N days.
Returns:
Tuple of (from_date, to_date) as YYYY-MM-DD strings
"""
today = datetime.now(timezone.utc).date()
from_date = today - timedelta(days=days)
return from_date.isoformat(), today.isoformat()
def parse_date(date_str: Optional[str]) -> Optional[datetime]:
"""Parse a date string in various formats.
Supports: YYYY-MM-DD, ISO 8601, Unix timestamp
"""
if not date_str:
return None
# Try Unix timestamp (from Reddit)
try:
ts = float(date_str)
return datetime.fromtimestamp(ts, tz=timezone.utc)
except (ValueError, TypeError):
pass
# Try ISO formats
formats = [
"%Y-%m-%d",
"%Y-%m-%dT%H:%M:%S",
"%Y-%m-%dT%H:%M:%SZ",
"%Y-%m-%dT%H:%M:%S%z",
"%Y-%m-%dT%H:%M:%S.%f%z",
]
for fmt in formats:
try:
dt = datetime.strptime(date_str, fmt)
# Only assume UTC for naive datetimes; keep an explicit %z offset.
return dt if dt.tzinfo else dt.replace(tzinfo=timezone.utc)
except ValueError:
continue
return None
def timestamp_to_date(ts: Optional[float]) -> Optional[str]:
"""Convert Unix timestamp to YYYY-MM-DD string."""
if ts is None:
return None
try:
dt = datetime.fromtimestamp(ts, tz=timezone.utc)
return dt.date().isoformat()
except (ValueError, TypeError, OSError):
return None
def get_date_confidence(date_str: Optional[str], from_date: str, to_date: str) -> str:
"""Determine confidence level for a date.
Args:
date_str: The date to check (YYYY-MM-DD or None)
from_date: Start of valid range (YYYY-MM-DD)
to_date: End of valid range (YYYY-MM-DD)
Returns:
'high' or 'low' ('med' is assigned directly by some fetchers, e.g. websearch)
"""
if not date_str:
return 'low'
try:
dt = datetime.strptime(date_str, "%Y-%m-%d").date()
start = datetime.strptime(from_date, "%Y-%m-%d").date()
end = datetime.strptime(to_date, "%Y-%m-%d").date()
if start <= dt <= end:
return 'high'
elif dt < start:
# Older than range
return 'low'
else:
# Future date (suspicious)
return 'low'
except ValueError:
return 'low'
def days_ago(date_str: Optional[str]) -> Optional[int]:
"""Calculate how many days ago a date is.
Returns None if date is invalid or missing.
"""
if not date_str:
return None
try:
dt = datetime.strptime(date_str, "%Y-%m-%d").date()
today = datetime.now(timezone.utc).date()
delta = today - dt
return delta.days
except ValueError:
return None
def recency_score(date_str: Optional[str], max_days: int = 30) -> int:
"""Calculate recency score (0-100).
0 days ago = 100, max_days ago = 0, clamped.
"""
age = days_ago(date_str)
if age is None:
return 0 # Unknown date gets worst score
if age < 0:
return 100 # Future date (treat as today)
if age >= max_days:
return 0
return int(100 * (1 - age / max_days))
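The recency math above is easy to sanity-check in isolation. A standalone sketch (logic restated from `recency_score`/`days_ago`; the sample dates are generated, not real data):

```python
from datetime import datetime, timedelta, timezone

def recency_score(date_str: str, max_days: int = 30) -> int:
    # 0 days ago -> 100, max_days (or older) -> 0, linear in between
    dt = datetime.strptime(date_str, "%Y-%m-%d").date()
    age = (datetime.now(timezone.utc).date() - dt).days
    if age < 0:
        return 100  # future date, treat as today
    if age >= max_days:
        return 0
    return int(100 * (1 - age / max_days))

today = datetime.now(timezone.utc).date()
print(recency_score(today.isoformat()))                          # 100
print(recency_score((today - timedelta(days=15)).isoformat()))   # 50
print(recency_score((today - timedelta(days=45)).isoformat()))   # 0
```

The linear decay means a 15-day-old item scores exactly half of a fresh one, which keeps scoring explainable when items are ranked across sources.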

View File

@@ -0,0 +1,250 @@
"""Near-duplicate detection for last30days skill."""
import re
from typing import List, Set, Tuple, Union
from . import schema
# Stopwords for token-based Jaccard (cross-source linking)
STOPWORDS = frozenset({
'the', 'a', 'an', 'to', 'for', 'how', 'is', 'in', 'of', 'on',
'and', 'with', 'from', 'by', 'at', 'this', 'that', 'it', 'my',
'your', 'i', 'me', 'we', 'you', 'what', 'are', 'do', 'can',
'its', 'be', 'or', 'not', 'no', 'so', 'if', 'but', 'about',
'all', 'just', 'get', 'has', 'have', 'was', 'will', 'show', 'hn',
})
def normalize_text(text: str) -> str:
"""Normalize text for comparison.
- Lowercase
- Remove punctuation
- Collapse whitespace
"""
text = text.lower()
text = re.sub(r'[^\w\s]', ' ', text)
text = re.sub(r'\s+', ' ', text)
return text.strip()
def get_ngrams(text: str, n: int = 3) -> Set[str]:
"""Get character n-grams from text."""
text = normalize_text(text)
if len(text) < n:
return {text}
return {text[i:i+n] for i in range(len(text) - n + 1)}
def jaccard_similarity(set1: Set[str], set2: Set[str]) -> float:
"""Compute Jaccard similarity between two sets."""
if not set1 or not set2:
return 0.0
intersection = len(set1 & set2)
union = len(set1 | set2)
return intersection / union if union > 0 else 0.0
AnyItem = Union[schema.RedditItem, schema.XItem, schema.YouTubeItem,
schema.HackerNewsItem, schema.PolymarketItem, schema.WebSearchItem]
def get_item_text(item: AnyItem) -> str:
"""Get comparable text from an item."""
if isinstance(item, schema.RedditItem):
return item.title
elif isinstance(item, schema.HackerNewsItem):
return item.title
elif isinstance(item, schema.YouTubeItem):
return f"{item.title} {item.channel_name}"
elif isinstance(item, schema.PolymarketItem):
return f"{item.title} {item.question}"
elif isinstance(item, schema.WebSearchItem):
return item.title
else:
return item.text
def _get_cross_source_text(item: AnyItem) -> str:
"""Get text for cross-source comparison.
Same as get_item_text() but truncates X posts to 100 chars
to level the playing field against short Reddit/HN titles.
Strips 'Show HN:' prefix from HN titles for fairer matching.
"""
if isinstance(item, schema.XItem):
return item.text[:100]
if isinstance(item, schema.HackerNewsItem):
title = item.title
for prefix in ("Show HN:", "Ask HN:"):
if title.startswith(prefix):
title = title[len(prefix):].strip()
break
return title
if isinstance(item, schema.PolymarketItem):
return item.title
return get_item_text(item)
def _tokenize_for_xref(text: str) -> Set[str]:
"""Tokenize text for cross-source token Jaccard comparison."""
words = re.sub(r'[^\w\s]', ' ', text.lower()).split()
return {w for w in words if w not in STOPWORDS and len(w) > 1}
def _token_jaccard(text_a: str, text_b: str) -> float:
"""Token-level Jaccard similarity (word overlap)."""
tokens_a = _tokenize_for_xref(text_a)
tokens_b = _tokenize_for_xref(text_b)
if not tokens_a or not tokens_b:
return 0.0
intersection = len(tokens_a & tokens_b)
union = len(tokens_a | tokens_b)
return intersection / union if union else 0.0
def _hybrid_similarity(text_a: str, text_b: str) -> float:
"""Hybrid similarity: max of char-trigram Jaccard and token Jaccard."""
trigram_sim = jaccard_similarity(get_ngrams(text_a), get_ngrams(text_b))
token_sim = _token_jaccard(text_a, text_b)
return max(trigram_sim, token_sim)
def find_duplicates(
items: List[Union[schema.RedditItem, schema.XItem]],
threshold: float = 0.7,
) -> List[Tuple[int, int]]:
"""Find near-duplicate pairs in items.
Args:
items: List of items to check
threshold: Similarity threshold (0-1)
Returns:
List of (i, j) index pairs where i < j and items are similar
"""
duplicates = []
# Pre-compute n-grams
ngrams = [get_ngrams(get_item_text(item)) for item in items]
for i in range(len(items)):
for j in range(i + 1, len(items)):
similarity = jaccard_similarity(ngrams[i], ngrams[j])
if similarity >= threshold:
duplicates.append((i, j))
return duplicates
def dedupe_items(
items: List[Union[schema.RedditItem, schema.XItem]],
threshold: float = 0.7,
) -> List[Union[schema.RedditItem, schema.XItem]]:
"""Remove near-duplicates, keeping highest-scored item.
Args:
items: List of items (should be pre-sorted by score descending)
threshold: Similarity threshold
Returns:
Deduplicated items
"""
if len(items) <= 1:
return items
# Find duplicate pairs
dup_pairs = find_duplicates(items, threshold)
# Mark indices to remove: keep the higher-scored item of each pair
# (with input pre-sorted by score descending, that is usually the lower index)
to_remove = set()
for i, j in dup_pairs:
# Keep the higher-scored one (lower index in sorted list)
if items[i].score >= items[j].score:
to_remove.add(j)
else:
to_remove.add(i)
# Return items not marked for removal
return [item for idx, item in enumerate(items) if idx not in to_remove]
def dedupe_reddit(
items: List[schema.RedditItem],
threshold: float = 0.7,
) -> List[schema.RedditItem]:
"""Dedupe Reddit items."""
return dedupe_items(items, threshold)
def dedupe_x(
items: List[schema.XItem],
threshold: float = 0.7,
) -> List[schema.XItem]:
"""Dedupe X items."""
return dedupe_items(items, threshold)
def dedupe_youtube(
items: List[schema.YouTubeItem],
threshold: float = 0.7,
) -> List[schema.YouTubeItem]:
"""Dedupe YouTube items."""
return dedupe_items(items, threshold)
def dedupe_hackernews(
items: List[schema.HackerNewsItem],
threshold: float = 0.7,
) -> List[schema.HackerNewsItem]:
"""Dedupe Hacker News items."""
return dedupe_items(items, threshold)
def dedupe_polymarket(
items: List[schema.PolymarketItem],
threshold: float = 0.7,
) -> List[schema.PolymarketItem]:
"""Dedupe Polymarket items."""
return dedupe_items(items, threshold)
def cross_source_link(
*source_lists: List[AnyItem],
threshold: float = 0.40,
) -> None:
"""Annotate items with cross-source references.
Compares items across different source types using hybrid similarity
(max of char-trigram Jaccard and token Jaccard). When similarity exceeds
threshold, adds bidirectional cross_refs with the related item's ID.
Modifies items in-place.
Args:
*source_lists: Variable number of per-source item lists
threshold: Similarity threshold for cross-linking (default 0.40)
"""
all_items = []
for source_list in source_lists:
all_items.extend(source_list)
if len(all_items) <= 1:
return
# Pre-compute cross-source text for each item
texts = [_get_cross_source_text(item) for item in all_items]
for i in range(len(all_items)):
for j in range(i + 1, len(all_items)):
# Skip same-source comparisons (handled by per-source dedupe)
if type(all_items[i]) is type(all_items[j]):
continue
similarity = _hybrid_similarity(texts[i], texts[j])
if similarity >= threshold:
# Bidirectional cross-reference
if all_items[j].id not in all_items[i].cross_refs:
all_items[i].cross_refs.append(all_items[j].id)
if all_items[i].id not in all_items[j].cross_refs:
all_items[j].cross_refs.append(all_items[i].id)
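The character-trigram Jaccard used for same-source dedupe can be exercised on its own. A minimal sketch restating `normalize_text`/`get_ngrams`/`jaccard_similarity` with made-up titles:

```python
import re

def normalize_text(text: str) -> str:
    # lowercase, drop punctuation, collapse whitespace
    text = re.sub(r'[^\w\s]', ' ', text.lower())
    return re.sub(r'\s+', ' ', text).strip()

def get_ngrams(text: str, n: int = 3) -> set:
    text = normalize_text(text)
    if len(text) < n:
        return {text}
    return {text[i:i+n] for i in range(len(text) - n + 1)}

def jaccard(a: set, b: set) -> float:
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

# Punctuation and case differences normalize away entirely:
sim = jaccard(get_ngrams("Show HN: My new AI coding tool"),
              get_ngrams("show hn - my new ai coding tool!"))
print(sim)  # 1.0

# Unrelated titles share no trigrams:
print(jaccard(get_ngrams("alpha beta"), get_ngrams("gamma delta")))  # 0.0
```

At the default 0.7 threshold the first pair would be collapsed by `dedupe_items` while the second pair survives; the hybrid token Jaccard only matters for cross-source linking, where short titles meet long X posts.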

View File

@@ -0,0 +1,127 @@
"""Entity extraction from Phase 1 search results for supplemental searches."""
import re
from collections import Counter
from typing import Any, Dict, List
# Handles that appear too frequently to be useful for targeted search.
# These are generic/platform accounts, not topic-specific voices.
GENERIC_HANDLES = {
"elonmusk", "openai", "google", "microsoft", "apple", "meta",
"github", "youtube", "x", "twitter", "reddit", "wikipedia",
"nytimes", "washingtonpost", "cnn", "bbc", "reuters",
"verified", "jack", "sundarpichai",
}
def extract_entities(
reddit_items: List[Dict[str, Any]],
x_items: List[Dict[str, Any]],
max_handles: int = 5,
max_hashtags: int = 3,
max_subreddits: int = 5,
) -> Dict[str, List[str]]:
"""Extract key entities from Phase 1 results for supplemental searches.
Parses X results for @handles and #hashtags, Reddit results for subreddit
names and cross-referenced communities.
Args:
reddit_items: Raw Reddit item dicts from Phase 1
x_items: Raw X item dicts from Phase 1
max_handles: Maximum handles to return
max_hashtags: Maximum hashtags to return
max_subreddits: Maximum subreddits to return
Returns:
Dict with keys: x_handles, x_hashtags, reddit_subreddits
"""
handles = _extract_x_handles(x_items)
hashtags = _extract_x_hashtags(x_items)
subreddits = _extract_subreddits(reddit_items)
return {
"x_handles": handles[:max_handles],
"x_hashtags": hashtags[:max_hashtags],
"reddit_subreddits": subreddits[:max_subreddits],
}
def _extract_x_handles(x_items: List[Dict[str, Any]]) -> List[str]:
"""Extract and rank @handles from X results.
Sources handles from:
1. author_handle field (who posted)
2. @mentions in post text (who they're talking about/to)
Returns handles ranked by frequency, filtered for generic accounts.
"""
handle_counts = Counter()
for item in x_items:
# Author handle
author = item.get("author_handle", "").strip().lstrip("@").lower()
if author and author not in GENERIC_HANDLES:
handle_counts[author] += 1
# @mentions in text
text = item.get("text", "")
mentions = re.findall(r'@(\w{1,15})', text)
for mention in mentions:
mention_lower = mention.lower()
if mention_lower not in GENERIC_HANDLES:
handle_counts[mention_lower] += 1
# Return all handles ranked by frequency
return [h for h, _ in handle_counts.most_common()]
def _extract_x_hashtags(x_items: List[Dict[str, Any]]) -> List[str]:
"""Extract and rank #hashtags from X results.
Returns hashtags ranked by frequency.
"""
hashtag_counts = Counter()
for item in x_items:
text = item.get("text", "")
tags = re.findall(r'#(\w{2,30})', text)
for tag in tags:
hashtag_counts[tag.lower()] += 1
# Return all hashtags ranked by frequency
return [f"#{t}" for t, _ in hashtag_counts.most_common()]
def _extract_subreddits(reddit_items: List[Dict[str, Any]]) -> List[str]:
"""Extract and rank subreddits from Reddit results.
Sources from:
1. subreddit field on each result
2. Cross-references in comment text (e.g., "check out r/localLLaMA")
Returns subreddits ranked by frequency.
"""
sub_counts = Counter()
for item in reddit_items:
# Primary subreddit. Strip only an exact "r/" prefix — str.lstrip("r/")
# strips *characters*, so it would mangle e.g. "rust" into "ust".
sub = item.get("subreddit", "").strip()
if sub.startswith("r/"):
sub = sub[2:]
if sub:
sub_counts[sub] += 1
# Cross-references in comment insights
for insight in item.get("comment_insights", []):
cross_refs = re.findall(r'r/(\w{2,30})', insight)
for ref in cross_refs:
sub_counts[ref] += 1
# Cross-references in top comments
for comment in item.get("top_comments", []):
excerpt = comment.get("excerpt", "")
cross_refs = re.findall(r'r/(\w{2,30})', excerpt)
for ref in cross_refs:
sub_counts[ref] += 1
# Return subreddits ranked by frequency
return [sub for sub, _ in sub_counts.most_common()]
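The handle-extraction pattern above is small enough to demo standalone. A sketch restating `_extract_x_handles` (the post texts and handles are hypothetical samples, and the generic-handle set is abbreviated):

```python
import re
from collections import Counter

GENERIC = {"openai", "github"}  # abbreviated stand-in for GENERIC_HANDLES

posts = [
    "Great thread by @karpathy on agents",
    "@karpathy and @openai keep shipping",
]

# X handles are 1-15 word characters; count mentions, drop generic accounts
counts = Counter(
    m.lower()
    for text in posts
    for m in re.findall(r'@(\w{1,15})', text)
    if m.lower() not in GENERIC
)
print([h for h, _ in counts.most_common()])  # ['karpathy']
```

Frequency ranking means the handles that recur across Phase 1 results float to the top, which is exactly what the supplemental `from:@handle` searches want.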

View File

@@ -0,0 +1,627 @@
"""Environment and API key management for last30days skill."""
import base64
import json
import os
import time
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Dict, Any, Literal, Iterable, List
# Allow override via environment variable for testing
# Set LAST30DAYS_CONFIG_DIR="" for clean/no-config mode
# Set LAST30DAYS_CONFIG_DIR="/path/to/dir" for custom config location
_config_override = os.environ.get('LAST30DAYS_CONFIG_DIR')
if _config_override == "":
# Empty string = no config file (clean mode)
CONFIG_DIR = None
CONFIG_FILE = None
elif _config_override:
CONFIG_DIR = Path(_config_override)
CONFIG_FILE = CONFIG_DIR / ".env"
else:
CONFIG_DIR = Path.home() / ".config" / "last30days"
CONFIG_FILE = CONFIG_DIR / ".env"
CODEX_AUTH_FILE = Path(os.environ.get("CODEX_AUTH_FILE", str(Path.home() / ".codex" / "auth.json")))
AuthSource = Literal["api_key", "codex", "none"]
AuthStatus = Literal["ok", "missing", "expired", "missing_account_id"]
AUTH_SOURCE_API_KEY: AuthSource = "api_key"
AUTH_SOURCE_CODEX: AuthSource = "codex"
AUTH_SOURCE_NONE: AuthSource = "none"
AUTH_STATUS_OK: AuthStatus = "ok"
AUTH_STATUS_MISSING: AuthStatus = "missing"
AUTH_STATUS_EXPIRED: AuthStatus = "expired"
AUTH_STATUS_MISSING_ACCOUNT_ID: AuthStatus = "missing_account_id"
PLACEHOLDER_LITERALS = {
"...",
"sk-...",
"sk-or-...",
"xai-...",
"null",
"none",
"changeme",
"change_me",
"replace_me",
"replace-me",
"your_key",
"your-api-key",
"your_api_key",
}
PLACEHOLDER_PREFIXES = (
"your_",
"your-",
"replace_",
"replace-",
"paste_",
"paste-",
)
PLACEHOLDER_SUFFIXES = (
"_here",
"-here",
)
@dataclass(frozen=True)
class OpenAIAuth:
token: Optional[str]
source: AuthSource
status: AuthStatus
account_id: Optional[str]
codex_auth_file: str
def load_env_file(path: Path) -> Dict[str, str]:
"""Load environment variables from a file."""
env = {}
if not path.exists():
return env
with open(path, 'r') as f:
for line in f:
line = line.strip()
if not line or line.startswith('#'):
continue
if '=' in line:
key, _, value = line.partition('=')
key = key.strip()
value = value.strip()
# Remove quotes if present
if value and value[0] in ('"', "'") and value[-1] == value[0]:
value = value[1:-1]
if key and value:
env[key] = value
return env
def _deep_merge_dict(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
"""Recursively merge dictionaries."""
result: Dict[str, Any] = dict(base)
for key, value in override.items():
if (
key in result
and isinstance(result[key], dict)
and isinstance(value, dict)
):
result[key] = _deep_merge_dict(result[key], value)
else:
result[key] = value
return result
def _unique_paths(paths: Iterable[Path]) -> List[Path]:
"""Deduplicate paths while preserving order."""
seen = set()
result: List[Path] = []
for path in paths:
key = str(path)
if key in seen:
continue
seen.add(key)
result.append(path)
return result
def _settings_path_candidates() -> List[Path]:
"""Build candidate settings.json paths.
Priority:
1) Global app settings (data/settings/settings.json)
2) Local settings files (settings.json, .meta/settings.json)
3) Project settings when PROJECT_ID is available
"""
cwd = Path.cwd().resolve()
ancestors = [cwd] + list(cwd.parents)
# Keep search bounded; repo roots are usually within a few levels
ancestors = ancestors[:8]
project_id = (os.environ.get("PROJECT_ID") or "").strip()
global_paths: List[Path] = []
local_paths: List[Path] = []
project_paths: List[Path] = []
for base in ancestors:
global_paths.append(base / "data" / "settings" / "settings.json")
local_paths.append(base / "settings.json")
local_paths.append(base / ".meta" / "settings.json")
if project_id:
project_paths.append(base / "data" / "projects" / project_id / "settings.json")
project_paths.append(base / "data" / "projects" / project_id / ".meta" / "settings.json")
# Later files override earlier files
ordered = _unique_paths(global_paths + local_paths + project_paths)
return ordered
def _load_settings_overrides() -> Dict[str, Any]:
"""Load app/project settings.json and extract API key overrides."""
merged_settings: Dict[str, Any] = {}
for path in _settings_path_candidates():
if not path.exists():
continue
try:
with open(path, "r") as f:
data = json.load(f)
if isinstance(data, dict):
merged_settings = _deep_merge_dict(merged_settings, data)
except Exception:
continue
if not merged_settings:
return {}
overrides: Dict[str, Any] = {}
# search.apiKey (Tavily key used by app search provider)
search_cfg = merged_settings.get("search")
if isinstance(search_cfg, dict):
search_key = search_cfg.get("apiKey")
if isinstance(search_key, str):
overrides["TAVILY_API_KEY"] = search_key
# Provider model keys from app settings (fallbacks for skill runtime)
for model_key in ("chatModel", "utilityModel", "embeddingsModel"):
cfg = merged_settings.get(model_key)
if not isinstance(cfg, dict):
continue
provider = str(cfg.get("provider", "")).strip().lower()
api_key = cfg.get("apiKey")
if not isinstance(api_key, str):
continue
if provider == "openai" and "OPENAI_API_KEY" not in overrides:
overrides["OPENAI_API_KEY"] = api_key
elif provider == "openrouter" and "OPENROUTER_API_KEY" not in overrides:
overrides["OPENROUTER_API_KEY"] = api_key
return overrides
def _decode_jwt_payload(token: str) -> Optional[Dict[str, Any]]:
"""Decode JWT payload without verification."""
try:
parts = token.split(".")
if len(parts) < 2:
return None
payload_b64 = parts[1]
pad = "=" * (-len(payload_b64) % 4)
decoded = base64.urlsafe_b64decode(payload_b64 + pad)
return json.loads(decoded.decode("utf-8"))
except Exception:
return None
def _token_expired(token: str, leeway_seconds: int = 60) -> bool:
"""Check if JWT token is expired."""
payload = _decode_jwt_payload(token)
if not payload:
return False
exp = payload.get("exp")
if not exp:
return False
return exp <= (time.time() + leeway_seconds)
def extract_chatgpt_account_id(access_token: str) -> Optional[str]:
"""Extract chatgpt_account_id from JWT token."""
payload = _decode_jwt_payload(access_token)
if not payload:
return None
auth_claim = payload.get("https://api.openai.com/auth", {})
if isinstance(auth_claim, dict):
return auth_claim.get("chatgpt_account_id")
return None
def load_codex_auth(path: Path = CODEX_AUTH_FILE) -> Dict[str, Any]:
"""Load Codex auth JSON."""
if not path.exists():
return {}
try:
with open(path, "r") as f:
return json.load(f)
except Exception:
return {}
def get_codex_access_token() -> tuple[Optional[str], str]:
"""Get Codex access token from auth.json.
Returns:
(token, status) where status is 'ok', 'missing', or 'expired'
"""
auth = load_codex_auth()
token = None
if isinstance(auth, dict):
tokens = auth.get("tokens") or {}
if isinstance(tokens, dict):
token = tokens.get("access_token")
if not token:
token = auth.get("access_token")
if not token:
return None, AUTH_STATUS_MISSING
if _token_expired(token):
return None, AUTH_STATUS_EXPIRED
return token, AUTH_STATUS_OK
def get_openai_auth(file_env: Dict[str, str], settings_overrides: Optional[Dict[str, Any]] = None) -> OpenAIAuth:
"""Resolve OpenAI auth from API key or Codex login."""
settings_overrides = settings_overrides or {}
api_key = _resolve_secret(
os.environ.get('OPENAI_API_KEY'),
file_env.get('OPENAI_API_KEY'),
settings_overrides.get('OPENAI_API_KEY'),
)
if api_key:
return OpenAIAuth(
token=api_key,
source=AUTH_SOURCE_API_KEY,
status=AUTH_STATUS_OK,
account_id=None,
codex_auth_file=str(CODEX_AUTH_FILE),
)
codex_token, codex_status = get_codex_access_token()
if codex_token:
account_id = extract_chatgpt_account_id(codex_token)
if account_id:
return OpenAIAuth(
token=codex_token,
source=AUTH_SOURCE_CODEX,
status=AUTH_STATUS_OK,
account_id=account_id,
codex_auth_file=str(CODEX_AUTH_FILE),
)
return OpenAIAuth(
token=None,
source=AUTH_SOURCE_CODEX,
status=AUTH_STATUS_MISSING_ACCOUNT_ID,
account_id=None,
codex_auth_file=str(CODEX_AUTH_FILE),
)
return OpenAIAuth(
token=None,
source=AUTH_SOURCE_NONE,
status=codex_status,
account_id=None,
codex_auth_file=str(CODEX_AUTH_FILE),
)
def get_config() -> Dict[str, Any]:
"""Load configuration from ~/.config/last30days/.env and environment."""
# Load from config file first (if configured)
file_env = load_env_file(CONFIG_FILE) if CONFIG_FILE else {}
settings_overrides = _load_settings_overrides()
openai_auth = get_openai_auth(file_env, settings_overrides=settings_overrides)
# Build config: Codex/OpenAI auth + process.env > .env file
config = {
'OPENAI_API_KEY': openai_auth.token,
'OPENAI_AUTH_SOURCE': openai_auth.source,
'OPENAI_AUTH_STATUS': openai_auth.status,
'OPENAI_CHATGPT_ACCOUNT_ID': openai_auth.account_id,
'CODEX_AUTH_FILE': openai_auth.codex_auth_file,
}
secret_keys = (
'XAI_API_KEY',
'OPENROUTER_API_KEY',
'PARALLEL_API_KEY',
'BRAVE_API_KEY',
'TAVILY_API_KEY',
'AUTH_TOKEN',
'CT0',
)
for key in secret_keys:
config[key] = _resolve_secret(
os.environ.get(key),
file_env.get(key),
settings_overrides.get(key),
)
plain_keys = [
('OPENAI_MODEL_POLICY', 'auto'),
('OPENAI_MODEL_PIN', None),
('XAI_MODEL_POLICY', 'latest'),
('XAI_MODEL_PIN', None),
]
for key, default in plain_keys:
config[key] = os.environ.get(key) or file_env.get(key, default)
return config
def config_exists() -> bool:
"""Check if configuration file exists."""
return bool(CONFIG_FILE and CONFIG_FILE.exists())
def _normalize_secret(value: Optional[str]) -> Optional[str]:
"""Normalize secret value and ignore obvious placeholders."""
if value is None:
return None
v = value.strip()
if not v:
return None
if len(v) >= 2 and v[0] in ('"', "'") and v[-1] == v[0]:
v = v[1:-1].strip()
if not v:
return None
lower = v.lower()
if lower in PLACEHOLDER_LITERALS:
return None
if "****" in v:
return None
if "..." in v:
return None
if lower.startswith(PLACEHOLDER_PREFIXES):
return None
if lower.endswith(PLACEHOLDER_SUFFIXES):
return None
return v
def _resolve_secret(*values: Optional[str]) -> Optional[str]:
"""Return the first non-placeholder secret value from a list."""
for value in values:
secret = _normalize_secret(value)
if secret:
return secret
return None
def get_available_sources(config: Dict[str, Any]) -> str:
"""Determine which sources are available based on API keys.
Returns: 'all', 'both', 'reddit', 'reddit-web', 'x', 'x-web', 'web', or 'none'
"""
has_openai = bool(config.get('OPENAI_API_KEY')) and config.get('OPENAI_AUTH_STATUS') == AUTH_STATUS_OK
has_xai = bool(config.get('XAI_API_KEY'))
has_web = has_web_search_keys(config)
if has_openai and has_xai:
return 'all' if has_web else 'both'
elif has_openai:
return 'reddit-web' if has_web else 'reddit'
elif has_xai:
return 'x-web' if has_web else 'x'
elif has_web:
return 'web'
else:
return 'web' # Fallback: assistant WebSearch (no API keys needed)
def has_web_search_keys(config: Dict[str, Any]) -> bool:
"""Check if any web search API keys are configured."""
return bool(
config.get('OPENROUTER_API_KEY')
or config.get('PARALLEL_API_KEY')
or config.get('BRAVE_API_KEY')
or config.get('TAVILY_API_KEY')
)
def get_web_search_source(config: Dict[str, Any]) -> Optional[str]:
"""Determine the best available web search backend.
Priority: Parallel AI > Brave > Tavily > OpenRouter/Sonar Pro
Returns: 'parallel', 'brave', 'openrouter', 'tavily', or None
"""
if config.get('PARALLEL_API_KEY'):
return 'parallel'
if config.get('BRAVE_API_KEY'):
return 'brave'
if config.get('TAVILY_API_KEY'):
return 'tavily'
if config.get('OPENROUTER_API_KEY'):
return 'openrouter'
return None
def get_missing_keys(config: Dict[str, Any]) -> str:
"""Determine which sources are missing (accounting for Bird).
Returns: 'all', 'both', 'reddit', 'x', 'web', or 'none'
"""
has_openai = bool(config.get('OPENAI_API_KEY')) and config.get('OPENAI_AUTH_STATUS') == AUTH_STATUS_OK
has_xai = bool(config.get('XAI_API_KEY'))
has_web = has_web_search_keys(config)
# Check if Bird provides X access (import here to avoid circular dependency)
from . import bird_x
has_bird = bird_x.is_bird_installed() and bird_x.is_bird_authenticated()
has_x = has_xai or has_bird
if has_openai and has_x and has_web:
return 'none'
elif has_openai and has_x:
return 'web' # Missing web search keys
elif has_openai:
return 'x' # Missing X source (and possibly web)
elif has_x:
return 'reddit' # Missing OpenAI key (and possibly web)
else:
return 'all' # Missing everything
def validate_sources(requested: str, available: str, include_web: bool = False) -> tuple[str, Optional[str]]:
"""Validate requested sources against available keys.
Args:
requested: 'auto', 'reddit', 'x', 'both', or 'web'
available: Result from get_available_sources()
include_web: If True, add WebSearch to available sources
Returns:
Tuple of (effective_sources, error_message)
"""
# No API keys at all
if available == 'none':
if requested == 'auto':
return 'web', "No API keys configured. The assistant can still search the web if it has a search tool."
elif requested == 'web':
return 'web', None
else:
return 'web', "No API keys configured. Add keys to ~/.config/last30days/.env for Reddit/X."
# Web-only mode (only web search API keys)
if available == 'web':
if requested == 'auto':
return 'web', None
elif requested == 'web':
return 'web', None
else:
return 'web', "Only web search keys configured. Add OPENAI_API_KEY (or run codex login) for Reddit, XAI_API_KEY for X."
if requested == 'auto':
# Add web to sources if include_web is set
if include_web:
if available == 'both':
return 'all', None # reddit + x + web
elif available == 'reddit':
return 'reddit-web', None
elif available == 'x':
return 'x-web', None
return available, None
if requested == 'web':
return 'web', None
if requested == 'both':
if available not in ('both', 'all'):
missing = 'xAI' if available in ('reddit', 'reddit-web') else 'OpenAI'
return 'none', f"Requested both sources but {missing} key is missing. Use --sources=auto to use available keys."
if include_web:
return 'all', None
return 'both', None
if requested == 'reddit':
if available in ('x', 'x-web'):
return 'none', "Requested Reddit but only xAI key is available."
if include_web:
return 'reddit-web', None
return 'reddit', None
if requested == 'x':
if available in ('reddit', 'reddit-web'):
return 'none', "Requested X but only OpenAI key is available."
if include_web:
return 'x-web', None
return 'x', None
return requested, None
def get_x_source(config: Dict[str, Any]) -> Optional[str]:
"""Determine the best available X/Twitter source.
Priority: Bird (free) → xAI (paid API)
Args:
config: Configuration dict from get_config()
Returns:
'bird' if Bird is installed and authenticated,
'xai' if XAI_API_KEY is configured,
None if no X source available.
"""
# Import here to avoid circular dependency
from . import bird_x
# Check Bird first (free option)
if bird_x.is_bird_installed():
username = bird_x.is_bird_authenticated()
if username:
return 'bird'
# Fall back to xAI if key exists
if config.get('XAI_API_KEY'):
return 'xai'
return None
def is_ytdlp_available() -> bool:
"""Check if yt-dlp is installed for YouTube search."""
from . import youtube_yt
return youtube_yt.is_ytdlp_installed()
def is_hackernews_available() -> bool:
"""Check if Hacker News source is available.
Always returns True - HN uses free Algolia API, no key needed.
"""
return True
def is_polymarket_available() -> bool:
"""Check if Polymarket source is available.
Always returns True - Gamma API is free, no key needed.
"""
return True
def get_x_source_status(config: Dict[str, Any]) -> Dict[str, Any]:
"""Get detailed X source status for UI decisions.
Returns:
Dict with keys: source, bird_installed, bird_authenticated,
bird_username, xai_available, can_install_bird
"""
from . import bird_x
bird_status = bird_x.get_bird_status()
xai_available = bool(config.get('XAI_API_KEY'))
# Determine active source
if bird_status["authenticated"]:
source = 'bird'
elif xai_available:
source = 'xai'
else:
source = None
return {
"source": source,
"bird_installed": bird_status["installed"],
"bird_authenticated": bird_status["authenticated"],
"bird_username": bird_status["username"],
"xai_available": xai_available,
"can_install_bird": bird_status["can_install"],
}

View File

@@ -0,0 +1,253 @@
"""Hacker News search via Algolia API (free, no auth required).
Uses hn.algolia.com/api/v1 for story discovery and comment enrichment.
No API key needed - just HTTP calls via stdlib urllib.
"""
import html
import math
import sys
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Any, Dict, List, Optional
from . import http
ALGOLIA_SEARCH_URL = "https://hn.algolia.com/api/v1/search"
ALGOLIA_SEARCH_BY_DATE_URL = "https://hn.algolia.com/api/v1/search_by_date"
ALGOLIA_ITEM_URL = "https://hn.algolia.com/api/v1/items"
DEPTH_CONFIG = {
"quick": 15,
"default": 30,
"deep": 60,
}
ENRICH_LIMITS = {
"quick": 3,
"default": 5,
"deep": 10,
}
def _log(msg: str):
"""Log to stderr (only in TTY mode to avoid cluttering Claude Code output)."""
if sys.stderr.isatty():
sys.stderr.write(f"[HN] {msg}\n")
sys.stderr.flush()
def _date_to_unix(date_str: str) -> int:
"""Convert YYYY-MM-DD to Unix timestamp (start of day UTC)."""
import datetime
parts = date_str.split("-")
year, month, day = int(parts[0]), int(parts[1]), int(parts[2])
dt = datetime.datetime(year, month, day, tzinfo=datetime.timezone.utc)
return int(dt.timestamp())
def _unix_to_date(ts: int) -> str:
"""Convert Unix timestamp to YYYY-MM-DD."""
import datetime
dt = datetime.datetime.fromtimestamp(ts, tz=datetime.timezone.utc)
return dt.strftime("%Y-%m-%d")
def _strip_html(text: str) -> str:
"""Strip HTML tags and decode entities from HN comment text."""
import re
text = html.unescape(text)
text = re.sub(r'<p>', '\n', text)
text = re.sub(r'<[^>]+>', '', text)
return text.strip()
def search_hackernews(
topic: str,
from_date: str,
to_date: str,
depth: str = "default",
) -> Dict[str, Any]:
"""Search Hacker News via Algolia API.
Args:
topic: Search topic
from_date: Start date (YYYY-MM-DD)
to_date: End date (YYYY-MM-DD)
depth: 'quick', 'default', or 'deep'
Returns:
Dict with Algolia response (contains 'hits' list).
"""
count = DEPTH_CONFIG.get(depth, DEPTH_CONFIG["default"])
from_ts = _date_to_unix(from_date)
to_ts = _date_to_unix(to_date) + 86400 # Include the end date
_log(f"Searching for '{topic}' (since {from_date}, count={count})")
# Use relevance-sorted search (better for topic matching)
params = {
"query": topic,
"tags": "story",
"numericFilters": f"created_at_i>{from_ts},created_at_i<{to_ts}",
"hitsPerPage": str(count),
}
from urllib.parse import urlencode
url = f"{ALGOLIA_SEARCH_URL}?{urlencode(params)}"
try:
response = http.request("GET", url, timeout=30)
except Exception as e:
# http.HTTPError is an Exception subclass; both cases log and degrade to empty results
_log(f"Search failed: {e}")
return {"hits": [], "error": str(e)}
hits = response.get("hits", [])
_log(f"Found {len(hits)} stories")
return response
def parse_hackernews_response(response: Dict[str, Any]) -> List[Dict[str, Any]]:
"""Parse Algolia response into normalized item dicts.
Returns:
List of item dicts ready for normalization.
"""
hits = response.get("hits", [])
items = []
for i, hit in enumerate(hits):
object_id = hit.get("objectID", "")
points = hit.get("points") or 0
num_comments = hit.get("num_comments") or 0
created_at_i = hit.get("created_at_i")
date_str = None
if created_at_i:
date_str = _unix_to_date(created_at_i)
# Article URL vs HN discussion URL
article_url = hit.get("url") or ""
hn_url = f"https://news.ycombinator.com/item?id={object_id}"
# Relevance: Algolia rank position gives a base, engagement boosts it
# Position 0 = most relevant from Algolia
rank_score = max(0.3, 1.0 - (i * 0.02)) # 1.0 -> 0.3 over 35 items
engagement_boost = min(0.2, math.log1p(points) / 40)
relevance = min(1.0, rank_score * 0.7 + engagement_boost + 0.1)
items.append({
"object_id": object_id,
"title": hit.get("title", ""),
"url": article_url,
"hn_url": hn_url,
"author": hit.get("author", ""),
"date": date_str,
"engagement": {
"points": points,
"num_comments": num_comments,
},
"relevance": round(relevance, 2),
"why_relevant": f"HN story about {hit.get('title', 'topic')[:60]}",
})
return items
def _fetch_item_comments(object_id: str, max_comments: int = 5) -> Dict[str, Any]:
"""Fetch top-level comments for a story from Algolia items endpoint.
Args:
object_id: HN story ID
max_comments: Max comments to return
Returns:
Dict with 'comments' list and 'comment_insights' list.
"""
url = f"{ALGOLIA_ITEM_URL}/{object_id}"
try:
data = http.request("GET", url, timeout=15)
except Exception as e:
_log(f"Failed to fetch comments for {object_id}: {e}")
return {"comments": [], "comment_insights": []}
children = data.get("children", [])
# Sort by points (highest first), filter to actual comments
real_comments = [
c for c in children
if c.get("text") and c.get("author")
]
real_comments.sort(key=lambda c: c.get("points") or 0, reverse=True)
comments = []
insights = []
for c in real_comments[:max_comments]:
text = _strip_html(c.get("text", ""))
excerpt = text[:300] + "..." if len(text) > 300 else text
comments.append({
"author": c.get("author", ""),
"text": excerpt,
"points": c.get("points") or 0,
})
# First sentence as insight
first_sentence = text.split(". ")[0].split("\n")[0][:200]
if first_sentence:
insights.append(first_sentence)
return {"comments": comments, "comment_insights": insights}
def enrich_top_stories(
items: List[Dict[str, Any]],
depth: str = "default",
) -> List[Dict[str, Any]]:
"""Fetch comments for top N stories by points.
Args:
items: Parsed HN items
depth: Research depth (controls how many to enrich)
Returns:
Items with top_comments and comment_insights added.
"""
if not items:
return items
limit = ENRICH_LIMITS.get(depth, ENRICH_LIMITS["default"])
# Sort by points to enrich the most popular stories
by_points = sorted(
range(len(items)),
key=lambda i: items[i].get("engagement", {}).get("points", 0),
reverse=True,
)
to_enrich = by_points[:limit]
_log(f"Enriching top {len(to_enrich)} stories with comments")
with ThreadPoolExecutor(max_workers=5) as executor:
futures = {
executor.submit(
_fetch_item_comments,
items[idx]["object_id"],
): idx
for idx in to_enrich
}
for future in as_completed(futures):
idx = futures[future]
try:
result = future.result(timeout=15)
items[idx]["top_comments"] = result["comments"]
items[idx]["comment_insights"] = result["comment_insights"]
except Exception:
items[idx]["top_comments"] = []
items[idx]["comment_insights"] = []
return items
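The enrichment pattern above (submit the top-N items by score to a thread pool, tolerate per-item failures) can be sketched in isolation; `fetch` here is a stand-in for `_fetch_item_comments`, and the helper name is hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def enrich(items, fetch, limit=3):
    # Enrich the highest-scoring items concurrently; a failed fetch
    # degrades to a None payload instead of aborting the batch
    order = sorted(range(len(items)), key=lambda i: items[i]["points"], reverse=True)
    with ThreadPoolExecutor(max_workers=5) as pool:
        futures = {pool.submit(fetch, items[i]["id"]): i for i in order[:limit]}
        for fut in as_completed(futures):
            i = futures[fut]
            try:
                items[i]["extra"] = fut.result()
            except Exception:
                items[i]["extra"] = None
    return items

data = [{"id": "a", "points": 10}, {"id": "b", "points": 50}, {"id": "c", "points": 5}]
enrich(data, lambda s: s.upper(), limit=2)
print([d.get("extra") for d in data])  # ['A', 'B', None]
```

With `limit=2`, only the two highest-scoring items ("b" and "a") are fetched; "c" is left untouched.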


@@ -0,0 +1,174 @@
"""HTTP utilities for last30days skill (stdlib only)."""
import json
import os
import sys
import time
import urllib.error
import urllib.request
from typing import Any, Dict, Optional
from urllib.parse import urlencode
DEFAULT_TIMEOUT = 30
DEBUG = os.environ.get("LAST30DAYS_DEBUG", "").lower() in ("1", "true", "yes")
def log(msg: str):
"""Log debug message to stderr."""
if DEBUG:
sys.stderr.write(f"[DEBUG] {msg}\n")
sys.stderr.flush()
MAX_RETRIES = 5
RETRY_DELAY = 2.0
USER_AGENT = "last30days-skill/2.1 (Assistant Skill)"
class HTTPError(Exception):
"""HTTP request error with status code."""
def __init__(self, message: str, status_code: Optional[int] = None, body: Optional[str] = None):
super().__init__(message)
self.status_code = status_code
self.body = body
def request(
method: str,
url: str,
headers: Optional[Dict[str, str]] = None,
json_data: Optional[Dict[str, Any]] = None,
timeout: int = DEFAULT_TIMEOUT,
retries: int = MAX_RETRIES,
raw: bool = False,
) -> Any:
"""Make an HTTP request and return JSON response.
Args:
method: HTTP method (GET, POST, etc.)
url: Request URL
headers: Optional headers dict
json_data: Optional JSON body (for POST)
timeout: Request timeout in seconds
retries: Number of retries on failure
Returns:
Parsed JSON response (or raw text if raw=True)
Raises:
HTTPError: On request failure
"""
headers = headers or {}
headers.setdefault("User-Agent", USER_AGENT)
data = None
if json_data is not None:
data = json.dumps(json_data).encode('utf-8')
headers.setdefault("Content-Type", "application/json")
req = urllib.request.Request(url, data=data, headers=headers, method=method)
log(f"{method} {url}")
last_error = None
for attempt in range(retries):
try:
with urllib.request.urlopen(req, timeout=timeout) as response:
body = response.read().decode('utf-8')
log(f"Response: {response.status} ({len(body)} bytes)")
if raw:
return body
return json.loads(body) if body else {}
except urllib.error.HTTPError as e:
body = None
try:
body = e.read().decode('utf-8')
except Exception:
pass
log(f"HTTP Error {e.code}: {e.reason}")
if body:
snippet = " ".join(body.split())
log(f"Error body: {snippet[:200]}")
last_error = HTTPError(f"HTTP {e.code}: {e.reason}", e.code, body)
# Don't retry client errors (4xx) except rate limits
if 400 <= e.code < 500 and e.code != 429:
raise last_error
if attempt < retries - 1:
if e.code == 429:
# Respect Retry-After header, fall back to exponential backoff
retry_after = e.headers.get("Retry-After") if hasattr(e, 'headers') else None
if retry_after:
try:
delay = float(retry_after)
except ValueError:
delay = RETRY_DELAY * (2 ** attempt) + 1
else:
delay = RETRY_DELAY * (2 ** attempt) + 1 # 3s, 5s, 9s...
log(f"Rate limited (429). Waiting {delay:.1f}s before retry {attempt + 2}/{retries}")
else:
delay = RETRY_DELAY * (2 ** attempt)
time.sleep(delay)
except urllib.error.URLError as e:
log(f"URL Error: {e.reason}")
last_error = HTTPError(f"URL Error: {e.reason}")
if attempt < retries - 1:
time.sleep(RETRY_DELAY * (attempt + 1))
except json.JSONDecodeError as e:
log(f"JSON decode error: {e}")
last_error = HTTPError(f"Invalid JSON response: {e}")
raise last_error
except (OSError, TimeoutError, ConnectionResetError) as e:
# Handle socket-level errors (connection reset, timeout, etc.)
log(f"Connection error: {type(e).__name__}: {e}")
last_error = HTTPError(f"Connection error: {type(e).__name__}: {e}")
if attempt < retries - 1:
time.sleep(RETRY_DELAY * (attempt + 1))
if last_error:
raise last_error
raise HTTPError("Request failed with no error details")
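The delay arithmetic in `request` yields different schedules for rate limits versus other transient failures; a minimal sketch of the same arithmetic (the helper name is hypothetical):

```python
RETRY_DELAY = 2.0

def retry_delay(attempt: int, rate_limited: bool) -> float:
    # 429: exponential backoff plus one second (3s, 5s, 9s, ...)
    if rate_limited:
        return RETRY_DELAY * (2 ** attempt) + 1
    # other retryable HTTP errors: plain exponential backoff (2s, 4s, 8s, ...)
    return RETRY_DELAY * (2 ** attempt)

print([retry_delay(a, True) for a in range(3)])   # [3.0, 5.0, 9.0]
print([retry_delay(a, False) for a in range(3)])  # [2.0, 4.0, 8.0]
```

In the real function a `Retry-After` header, when present and numeric, overrides the computed 429 delay.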
def get(url: str, headers: Optional[Dict[str, str]] = None, **kwargs) -> Dict[str, Any]:
"""Make a GET request."""
return request("GET", url, headers=headers, **kwargs)
def post(url: str, json_data: Dict[str, Any], headers: Optional[Dict[str, str]] = None, **kwargs) -> Dict[str, Any]:
"""Make a POST request with JSON body."""
return request("POST", url, headers=headers, json_data=json_data, **kwargs)
def post_raw(url: str, json_data: Dict[str, Any], headers: Optional[Dict[str, str]] = None, **kwargs) -> str:
"""Make a POST request with JSON body and return raw text."""
return request("POST", url, headers=headers, json_data=json_data, raw=True, **kwargs)
def get_reddit_json(path: str, timeout: int = DEFAULT_TIMEOUT, retries: int = MAX_RETRIES) -> Dict[str, Any]:
"""Fetch Reddit thread JSON.
Args:
path: Reddit path (e.g., /r/subreddit/comments/id/title)
timeout: HTTP timeout per attempt in seconds
retries: Number of retries on failure
Returns:
Parsed JSON response
"""
# Ensure path starts with /
if not path.startswith('/'):
path = '/' + path
# Remove trailing slash and add .json
path = path.rstrip('/')
if not path.endswith('.json'):
path = path + '.json'
url = f"https://www.reddit.com{path}?raw_json=1"
headers = {
"User-Agent": USER_AGENT,
"Accept": "application/json",
}
return get(url, headers=headers, timeout=timeout, retries=retries)
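The path normalization in `get_reddit_json` can be exercised standalone; `reddit_json_url` is a hypothetical helper that mirrors the same steps:

```python
def reddit_json_url(path: str) -> str:
    # Mirror get_reddit_json's normalization: leading slash,
    # no trailing slash, .json suffix, raw_json=1 query param
    if not path.startswith('/'):
        path = '/' + path
    path = path.rstrip('/')
    if not path.endswith('.json'):
        path += '.json'
    return f"https://www.reddit.com{path}?raw_json=1"

print(reddit_json_url("r/python/comments/abc/title/"))
# https://www.reddit.com/r/python/comments/abc/title.json?raw_json=1
```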


@@ -0,0 +1,185 @@
"""Model auto-selection for last30days skill."""
import re
from typing import Dict, List, Optional, Tuple
from . import cache, http, env
# OpenAI API
OPENAI_MODELS_URL = "https://api.openai.com/v1/models"
OPENAI_FALLBACK_MODELS = ["gpt-5.2", "gpt-5.1", "gpt-5", "gpt-4.1", "gpt-4o"]
CODEX_FALLBACK_MODELS = ["gpt-5.1-codex-mini", "gpt-5.2"]
# xAI API - Agent Tools API requires grok-4 family
XAI_MODELS_URL = "https://api.x.ai/v1/models"
XAI_ALIASES = {
"latest": "grok-4-1-fast", # Required for x_search tool
"stable": "grok-4-1-fast",
}
def parse_version(model_id: str) -> Optional[Tuple[int, ...]]:
"""Parse semantic version from model ID.
Examples:
gpt-5 -> (5,)
gpt-5.2 -> (5, 2)
gpt-5.2.1 -> (5, 2, 1)
"""
match = re.search(r'(\d+(?:\.\d+)*)', model_id)
if match:
return tuple(int(x) for x in match.group(1).split('.'))
return None
def is_mainline_openai_model(model_id: str) -> bool:
"""Check if model is a mainline GPT model (not mini/nano/chat/codex/pro)."""
model_lower = model_id.lower()
# Must be gpt-4o, gpt-4.1+, or gpt-5+ series (mainline, not mini/nano/etc)
if not re.match(r'^gpt-(?:4o|4\.1|5)(\.\d+)*$', model_lower):
return False
# Exclude variants
excludes = ['mini', 'nano', 'chat', 'codex', 'pro', 'preview', 'turbo']
for exc in excludes:
if exc in model_lower:
return False
return True
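The two filters above can be demonstrated with standalone copies of the same regexes (a sketch for illustration; model IDs are examples only):

```python
import re

def parse_ver(model_id):
    # Extract the first dotted number run as a version tuple
    m = re.search(r'(\d+(?:\.\d+)*)', model_id)
    return tuple(int(x) for x in m.group(1).split('.')) if m else None

def is_mainline(model_id):
    model_lower = model_id.lower()
    # Must be exactly gpt-4o, gpt-4.1+, or gpt-5+ with optional dot-suffixes
    if not re.match(r'^gpt-(?:4o|4\.1|5)(\.\d+)*$', model_lower):
        return False
    # Variant names never survive the anchored regex, but keep the
    # substring exclusion as a second guard
    return not any(x in model_lower for x in
                   ['mini', 'nano', 'chat', 'codex', 'pro', 'preview', 'turbo'])

print(parse_ver("gpt-5.2"))       # (5, 2)
print(is_mainline("gpt-5.2"))     # True
print(is_mainline("gpt-5-mini"))  # False
print(is_mainline("gpt-4o"))      # True
```

Version tuples compare naturally in Python, so `(5, 2) > (5,) > (4,)` gives the sort order the selector relies on.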
def select_openai_model(
api_key: str,
policy: str = "auto",
pin: Optional[str] = None,
mock_models: Optional[List[Dict]] = None,
) -> str:
"""Select the best OpenAI model based on policy.
Args:
api_key: OpenAI API key
policy: 'auto' or 'pinned'
pin: Model to use if policy is 'pinned'
mock_models: Mock model list for testing
Returns:
Selected model ID
"""
if policy == "pinned" and pin:
return pin
# Check cache first
cached = cache.get_cached_model("openai")
if cached:
return cached
# Fetch model list
if mock_models is not None:
models = mock_models
else:
try:
headers = {"Authorization": f"Bearer {api_key}"}
response = http.get(OPENAI_MODELS_URL, headers=headers)
models = response.get("data", [])
except http.HTTPError:
# Fall back to known models
return OPENAI_FALLBACK_MODELS[0]
# Filter to mainline models
candidates = [m for m in models if is_mainline_openai_model(m.get("id", ""))]
if not candidates:
# No gpt-5 models found, use fallback
return OPENAI_FALLBACK_MODELS[0]
# Sort by version (descending), then by created timestamp
def sort_key(m):
version = parse_version(m.get("id", "")) or (0,)
created = m.get("created", 0)
return (version, created)
candidates.sort(key=sort_key, reverse=True)
selected = candidates[0]["id"]
# Cache the selection
cache.set_cached_model("openai", selected)
return selected
def select_xai_model(
api_key: str,
policy: str = "latest",
pin: Optional[str] = None,
mock_models: Optional[List[Dict]] = None,
) -> str:
"""Select the best xAI model based on policy.
Args:
api_key: xAI API key
policy: 'latest', 'stable', or 'pinned'
pin: Model to use if policy is 'pinned'
mock_models: Mock model list for testing
Returns:
Selected model ID
"""
if policy == "pinned" and pin:
return pin
# Use alias system
if policy in XAI_ALIASES:
alias = XAI_ALIASES[policy]
# Check cache first
cached = cache.get_cached_model("xai")
if cached:
return cached
# Cache the alias
cache.set_cached_model("xai", alias)
return alias
# Default to latest
return XAI_ALIASES["latest"]
def get_models(
config: Dict,
mock_openai_models: Optional[List[Dict]] = None,
mock_xai_models: Optional[List[Dict]] = None,
) -> Dict[str, Optional[str]]:
"""Get selected models for both providers.
Returns:
Dict with 'openai' and 'xai' keys
"""
result = {"openai": None, "xai": None}
if config.get("OPENAI_API_KEY"):
if config.get("OPENAI_AUTH_SOURCE") == env.AUTH_SOURCE_CODEX:
# Codex auth doesn't use the OpenAI models list endpoint
policy = config.get("OPENAI_MODEL_POLICY", "auto")
pin = config.get("OPENAI_MODEL_PIN")
if policy == "pinned" and pin:
result["openai"] = pin
else:
result["openai"] = CODEX_FALLBACK_MODELS[0]
else:
result["openai"] = select_openai_model(
config["OPENAI_API_KEY"],
config.get("OPENAI_MODEL_POLICY", "auto"),
config.get("OPENAI_MODEL_PIN"),
mock_openai_models,
)
if config.get("XAI_API_KEY"):
result["xai"] = select_xai_model(
config["XAI_API_KEY"],
config.get("XAI_MODEL_POLICY", "latest"),
config.get("XAI_MODEL_PIN"),
mock_xai_models,
)
return result


@@ -0,0 +1,308 @@
"""Normalization of raw API data to canonical schema."""
from typing import Any, Dict, List, TypeVar, Union
from . import dates, schema
T = TypeVar("T", schema.RedditItem, schema.XItem, schema.WebSearchItem, schema.YouTubeItem, schema.HackerNewsItem, schema.PolymarketItem)
def filter_by_date_range(
items: List[T],
from_date: str,
to_date: str,
require_date: bool = False,
) -> List[T]:
"""Hard filter: Remove items outside the date range.
This is the safety net - even if the prompt lets old content through,
this filter will exclude it.
Args:
items: List of items to filter
from_date: Start date (YYYY-MM-DD) - exclude items before this
to_date: End date (YYYY-MM-DD) - exclude items after this
require_date: If True, also remove items with no date
Returns:
Filtered list with only items in range (or unknown dates if not required)
"""
result = []
for item in items:
if item.date is None:
if not require_date:
result.append(item) # Keep unknown dates (with scoring penalty)
continue
# Hard filter: if date is before from_date, exclude
if item.date < from_date:
continue # DROP - too old
# Hard filter: if date is after to_date, exclude (likely parsing error)
if item.date > to_date:
continue # DROP - future date
result.append(item)
return result
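The hard filter works because ISO `YYYY-MM-DD` strings sort lexicographically in calendar order; a standalone sketch of the range check (hypothetical helper name):

```python
def in_range(date, from_date, to_date, require_date=False):
    # ISO YYYY-MM-DD strings sort lexicographically in calendar order,
    # so plain string comparison is a valid date-range check
    if date is None:
        return not require_date  # unknown dates pass unless required
    return from_date <= date <= to_date

print(in_range("2026-02-15", "2026-02-01", "2026-03-01"))  # True
print(in_range("2026-01-31", "2026-02-01", "2026-03-01"))  # False (too old)
print(in_range("2026-03-02", "2026-02-01", "2026-03-01"))  # False (future)
print(in_range(None, "2026-02-01", "2026-03-01"))          # True (kept)
```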
def normalize_reddit_items(
items: List[Dict[str, Any]],
from_date: str,
to_date: str,
) -> List[schema.RedditItem]:
"""Normalize raw Reddit items to schema.
Args:
items: Raw Reddit items from API
from_date: Start of date range
to_date: End of date range
Returns:
List of RedditItem objects
"""
normalized = []
for item in items:
# Parse engagement
engagement = None
eng_raw = item.get("engagement")
if isinstance(eng_raw, dict):
engagement = schema.Engagement(
score=eng_raw.get("score"),
num_comments=eng_raw.get("num_comments"),
upvote_ratio=eng_raw.get("upvote_ratio"),
)
# Parse comments
top_comments = []
for c in item.get("top_comments", []):
top_comments.append(schema.Comment(
score=c.get("score", 0),
date=c.get("date"),
author=c.get("author", ""),
excerpt=c.get("excerpt", ""),
url=c.get("url", ""),
))
# Determine date confidence
date_str = item.get("date")
date_confidence = dates.get_date_confidence(date_str, from_date, to_date)
normalized.append(schema.RedditItem(
id=item.get("id", ""),
title=item.get("title", ""),
url=item.get("url", ""),
subreddit=item.get("subreddit", ""),
date=date_str,
date_confidence=date_confidence,
engagement=engagement,
top_comments=top_comments,
comment_insights=item.get("comment_insights", []),
relevance=item.get("relevance", 0.5),
why_relevant=item.get("why_relevant", ""),
))
return normalized
def normalize_x_items(
items: List[Dict[str, Any]],
from_date: str,
to_date: str,
) -> List[schema.XItem]:
"""Normalize raw X items to schema.
Args:
items: Raw X items from API
from_date: Start of date range
to_date: End of date range
Returns:
List of XItem objects
"""
normalized = []
for item in items:
# Parse engagement
engagement = None
eng_raw = item.get("engagement")
if isinstance(eng_raw, dict):
engagement = schema.Engagement(
likes=eng_raw.get("likes"),
reposts=eng_raw.get("reposts"),
replies=eng_raw.get("replies"),
quotes=eng_raw.get("quotes"),
)
# Determine date confidence
date_str = item.get("date")
date_confidence = dates.get_date_confidence(date_str, from_date, to_date)
normalized.append(schema.XItem(
id=item.get("id", ""),
text=item.get("text", ""),
url=item.get("url", ""),
author_handle=item.get("author_handle", ""),
date=date_str,
date_confidence=date_confidence,
engagement=engagement,
relevance=item.get("relevance", 0.5),
why_relevant=item.get("why_relevant", ""),
))
return normalized
def normalize_youtube_items(
items: List[Dict[str, Any]],
from_date: str,
to_date: str,
) -> List[schema.YouTubeItem]:
"""Normalize raw YouTube items to schema.
Args:
items: Raw YouTube items from yt-dlp
from_date: Start of date range
to_date: End of date range
Returns:
List of YouTubeItem objects
"""
normalized = []
for item in items:
# Parse engagement
eng_raw = item.get("engagement") or {}
engagement = schema.Engagement(
views=eng_raw.get("views"),
likes=eng_raw.get("likes"),
num_comments=eng_raw.get("comments"),
)
# YouTube dates are reliable (always YYYY-MM-DD from yt-dlp)
date_str = item.get("date")
normalized.append(schema.YouTubeItem(
id=item.get("video_id", ""),
title=item.get("title", ""),
url=item.get("url", ""),
channel_name=item.get("channel_name", ""),
date=date_str,
date_confidence="high",
engagement=engagement,
transcript_snippet=item.get("transcript_snippet", ""),
relevance=item.get("relevance", 0.7),
why_relevant=item.get("why_relevant", ""),
))
return normalized
def normalize_hackernews_items(
items: List[Dict[str, Any]],
from_date: str,
to_date: str,
) -> List[schema.HackerNewsItem]:
"""Normalize raw Hacker News items to schema.
Args:
items: Raw HN items from Algolia API
from_date: Start of date range
to_date: End of date range
Returns:
List of HackerNewsItem objects
"""
normalized = []
for i, item in enumerate(items):
# Parse engagement
eng_raw = item.get("engagement") or {}
engagement = schema.Engagement(
score=eng_raw.get("points"),
num_comments=eng_raw.get("num_comments"),
)
# Parse comments (from enrichment)
top_comments = []
for c in item.get("top_comments", []):
top_comments.append(schema.Comment(
score=c.get("points", 0),
date=None,
author=c.get("author", ""),
excerpt=c.get("text", ""),
url="",
))
# HN dates are always high confidence (exact timestamps from Algolia)
date_str = item.get("date")
normalized.append(schema.HackerNewsItem(
id=f"HN{i+1}",
title=item.get("title", ""),
url=item.get("url", ""),
hn_url=item.get("hn_url", ""),
author=item.get("author", ""),
date=date_str,
date_confidence="high",
engagement=engagement,
top_comments=top_comments,
comment_insights=item.get("comment_insights", []),
relevance=item.get("relevance", 0.5),
why_relevant=item.get("why_relevant", ""),
))
return normalized
def normalize_polymarket_items(
items: List[Dict[str, Any]],
from_date: str,
to_date: str,
) -> List[schema.PolymarketItem]:
"""Normalize raw Polymarket items to schema.
Args:
items: Raw Polymarket items from Gamma API
from_date: Start of date range
to_date: End of date range
Returns:
List of PolymarketItem objects
"""
normalized = []
for i, item in enumerate(items):
# Prefer volume1mo (more stable) for engagement scoring, fall back to volume24hr
volume = item.get("volume1mo") or item.get("volume24hr", 0.0)
engagement = schema.Engagement(
volume=volume,
liquidity=item.get("liquidity", 0.0),
)
date_str = item.get("date")
normalized.append(schema.PolymarketItem(
id=f"PM{i+1}",
title=item.get("title", ""),
question=item.get("question", ""),
url=item.get("url", ""),
outcome_prices=item.get("outcome_prices", []),
outcomes_remaining=item.get("outcomes_remaining", 0),
price_movement=item.get("price_movement"),
date=date_str,
date_confidence="high",
engagement=engagement,
end_date=item.get("end_date"),
relevance=item.get("relevance", 0.5),
why_relevant=item.get("why_relevant", ""),
))
return normalized
def items_to_dicts(items: List) -> List[Dict[str, Any]]:
"""Convert schema items to dicts for JSON serialization."""
return [item.to_dict() for item in items]


@@ -0,0 +1,531 @@
"""OpenAI Responses API client for Reddit discovery."""
import json
import re
import sys
from typing import Any, Dict, List, Optional
from . import http, env
# Fallback models when the selected model isn't accessible (e.g., org not verified for GPT-5)
# Note: gpt-4o-mini does NOT support web_search with filters param, so exclude it
MODEL_FALLBACK_ORDER = ["gpt-4.1", "gpt-4o"]
def _log_error(msg: str):
"""Log error to stderr."""
sys.stderr.write(f"[REDDIT ERROR] {msg}\n")
sys.stderr.flush()
def _log_info(msg: str):
"""Log info to stderr."""
sys.stderr.write(f"[REDDIT] {msg}\n")
sys.stderr.flush()
def _is_model_access_error(error: http.HTTPError) -> bool:
"""Check if error is due to model access/verification issues."""
if error.status_code not in (400, 403):
return False
if not error.body:
return False
body_lower = error.body.lower()
# Check for common access/verification error messages
return any(phrase in body_lower for phrase in [
"verified",
"organization must be",
"does not have access",
"not available",
"not found",
])
OPENAI_RESPONSES_URL = "https://api.openai.com/v1/responses"
CODEX_RESPONSES_URL = "https://chatgpt.com/backend-api/codex/responses"
CODEX_INSTRUCTIONS = (
"You are a research assistant for a skill that summarizes what people are "
"discussing in the last 30 days. Your goal is to find relevant Reddit threads "
"about the topic and return ONLY the required JSON. Be inclusive (return more "
"rather than fewer), but avoid irrelevant results. Prefer threads with discussion "
"and comments. If you can infer a date, include it; otherwise use null. "
"Do not include developers.reddit.com or business.reddit.com."
)
def _parse_sse_chunk(chunk: str) -> Optional[Dict[str, Any]]:
"""Parse a single SSE chunk into a JSON object."""
lines = chunk.split("\n")
data_lines = []
for line in lines:
if line.startswith("data:"):
data_lines.append(line[5:].strip())
if not data_lines:
return None
data = "\n".join(data_lines).strip()
if not data or data == "[DONE]":
return None
try:
return json.loads(data)
except json.JSONDecodeError:
return None
def _parse_sse_stream_raw(raw: str) -> List[Dict[str, Any]]:
"""Parse SSE stream from raw text and return JSON events."""
events: List[Dict[str, Any]] = []
buffer = ""
for chunk in raw.splitlines(keepends=True):
buffer += chunk
while "\n\n" in buffer:
event_chunk, buffer = buffer.split("\n\n", 1)
event = _parse_sse_chunk(event_chunk)
if event is not None:
events.append(event)
if buffer.strip():
event = _parse_sse_chunk(buffer)
if event is not None:
events.append(event)
return events
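The chunking logic above (events separated by blank lines, payload carried on `data:` lines) can be condensed into a short standalone sketch:

```python
import json

def parse_sse(raw: str):
    # Events are separated by blank lines; each data: line carries JSON
    events = []
    for chunk in raw.split("\n\n"):
        data = "\n".join(l[5:].strip() for l in chunk.split("\n")
                         if l.startswith("data:"))
        if data and data != "[DONE]":
            try:
                events.append(json.loads(data))
            except json.JSONDecodeError:
                pass  # skip malformed events, as the real parser does
    return events

stream = 'data: {"delta": "Hel"}\n\ndata: {"delta": "lo"}\n\ndata: [DONE]\n'
print(parse_sse(stream))  # [{'delta': 'Hel'}, {'delta': 'lo'}]
```

The production parser additionally buffers partial chunks, which matters when the stream is consumed incrementally rather than from a complete string.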
def _parse_codex_stream(raw: str) -> Dict[str, Any]:
"""Parse SSE stream from Codex responses into a response-like dict."""
events = _parse_sse_stream_raw(raw)
# Prefer explicit completed response payload if present
for evt in reversed(events):
if isinstance(evt, dict):
if evt.get("type") == "response.completed" and isinstance(evt.get("response"), dict):
return evt["response"]
if isinstance(evt.get("response"), dict):
return evt["response"]
# Fallback: reconstruct output text from deltas
output_text = ""
for evt in events:
if not isinstance(evt, dict):
continue
delta = evt.get("delta")
if isinstance(delta, str):
output_text += delta
continue
text = evt.get("text")
if isinstance(text, str):
output_text += text
if output_text:
return {
"output": [
{
"type": "message",
"content": [{"type": "output_text", "text": output_text}],
}
]
}
return {}
# Depth configurations: (min, max) threads to request
# Request MORE than needed since many get filtered by date
DEPTH_CONFIG = {
"quick": (15, 25),
"default": (30, 50),
"deep": (70, 100),
}
REDDIT_SEARCH_PROMPT = """Find Reddit discussion threads about: {topic}
STEP 1: EXTRACT THE CORE SUBJECT
Get the MAIN NOUN/PRODUCT/TOPIC:
- "best nano banana prompting practices""nano banana"
- "killer features of clawdbot""clawdbot"
- "top Claude Code skills""Claude Code"
DO NOT include "best", "top", "tips", "practices", "features" in your search.
STEP 2: SEARCH BROADLY
Search for the core subject:
1. "[core subject] site:reddit.com"
2. "reddit [core subject]"
3. "[core subject] reddit"
Return as many relevant threads as you find. We filter by date server-side.
STEP 3: INCLUDE ALL MATCHES
- Include ALL threads about the core subject
- Set date to "YYYY-MM-DD" if you can determine it, otherwise null
- We verify dates and filter old content server-side
- DO NOT pre-filter aggressively - include anything relevant
REQUIRED: URLs must contain "/r/" AND "/comments/"
REJECT: developers.reddit.com, business.reddit.com
Find {min_items}-{max_items} threads. Return MORE rather than fewer.
Return JSON:
{{
"items": [
{{
"title": "Thread title",
"url": "https://www.reddit.com/r/sub/comments/xyz/title/",
"subreddit": "subreddit_name",
"date": "YYYY-MM-DD or null",
"why_relevant": "Why relevant",
"relevance": 0.85
}}
]
}}"""
def _extract_core_subject(topic: str) -> str:
"""Extract core subject from verbose query for retry."""
noise = ['best', 'top', 'how', 'to', 'tips', 'practices', 'features',
'killer', 'guide', 'tutorial', 'recommendations', 'advice',
'prompting', 'using', 'for', 'with', 'the', 'of', 'in', 'on']
words = topic.lower().split()
result = [w for w in words if w not in noise]
return ' '.join(result[:3]) or topic # Keep max 3 words
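A standalone sketch of the stop-word stripping; note that membership is tested per word, so stop-phrases must be listed as single words (the helper name is hypothetical):

```python
NOISE = {'best', 'top', 'how', 'to', 'tips', 'practices', 'features',
         'killer', 'guide', 'tutorial', 'recommendations', 'advice',
         'prompting', 'using', 'for', 'with', 'the', 'of', 'in', 'on'}

def core_subject(topic: str) -> str:
    # Drop filler words, keep at most the first three remaining words;
    # fall back to the full topic if everything was filtered out
    kept = [w for w in topic.lower().split() if w not in NOISE]
    return ' '.join(kept[:3]) or topic

print(core_subject("best nano banana prompting practices"))  # nano banana
print(core_subject("top Claude Code skills"))                # claude code skills
```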
def _build_subreddit_query(topic: str) -> str:
"""Build a subreddit-targeted search query for fallback.
When standard search returns few results, try searching for the
subreddit itself: 'r/kanye', 'r/howie', etc.
"""
core = _extract_core_subject(topic)
# Remove dots and special chars for subreddit name guess
sub_name = core.replace('.', '').replace(' ', '').lower()
return f"r/{sub_name} site:reddit.com"
def _build_payload(model: str, instructions_text: str, input_text: str, auth_source: str) -> Dict[str, Any]:
"""Build responses payload for OpenAI or Codex endpoints."""
payload = {
"model": model,
"store": False,
"tools": [
{
"type": "web_search",
"filters": {
"allowed_domains": ["reddit.com"]
}
}
],
"include": ["web_search_call.action.sources"],
"instructions": instructions_text,
"input": input_text,
}
if auth_source == env.AUTH_SOURCE_CODEX:
payload["input"] = [
{
"type": "message",
"role": "user",
"content": [{"type": "input_text", "text": input_text}],
}
]
payload["stream"] = True
return payload
def search_reddit(
api_key: str,
model: str,
topic: str,
from_date: str,
to_date: str,
depth: str = "default",
auth_source: str = "api_key",
account_id: Optional[str] = None,
mock_response: Optional[Dict] = None,
_retry: bool = False,
) -> Dict[str, Any]:
"""Search Reddit for relevant threads using OpenAI Responses API.
Args:
api_key: OpenAI API key
model: Model to use
topic: Search topic
from_date: Start date (YYYY-MM-DD) - only include threads after this
to_date: End date (YYYY-MM-DD) - only include threads before this
depth: Research depth - "quick", "default", or "deep"
mock_response: Mock response for testing
Returns:
Raw API response
"""
if mock_response is not None:
return mock_response
min_items, max_items = DEPTH_CONFIG.get(depth, DEPTH_CONFIG["default"])
if auth_source == env.AUTH_SOURCE_CODEX:
if not account_id:
raise ValueError("Missing chatgpt_account_id for Codex auth")
headers = {
"Authorization": f"Bearer {api_key}",
"chatgpt-account-id": account_id,
"OpenAI-Beta": "responses=experimental",
"originator": "pi",
"Content-Type": "application/json",
}
url = CODEX_RESPONSES_URL
else:
headers = {
"Authorization": f"Bearer {api_key}",
"Content-Type": "application/json",
}
url = OPENAI_RESPONSES_URL
# Adjust timeout based on depth (generous for OpenAI web_search which can be slow)
timeout = 90 if depth == "quick" else 120 if depth == "default" else 180
# Build list of models to try: requested model first, then fallbacks
models_to_try = [model] + [m for m in MODEL_FALLBACK_ORDER if m != model]
# Note: allowed_domains accepts base domain, not subdomains
# We rely on prompt to filter out developers.reddit.com, etc.
input_text = REDDIT_SEARCH_PROMPT.format(
topic=topic,
from_date=from_date,
to_date=to_date,
min_items=min_items,
max_items=max_items,
)
if auth_source == env.AUTH_SOURCE_CODEX:
# Codex auth: try model with fallback chain
from . import models as models_mod
codex_models_to_try = [model] + [m for m in models_mod.CODEX_FALLBACK_MODELS if m != model]
instructions_text = CODEX_INSTRUCTIONS + "\n\n" + input_text
last_error = None
for current_model in codex_models_to_try:
try:
payload = _build_payload(current_model, instructions_text, topic, auth_source)
raw = http.post_raw(url, payload, headers=headers, timeout=timeout)
return _parse_codex_stream(raw or "")
except http.HTTPError as e:
last_error = e
if e.status_code == 400:
_log_info(f"Model {current_model} not supported on Codex, trying fallback...")
continue
raise
if last_error:
raise last_error
raise http.HTTPError("No Codex-compatible models available")
# Standard API key auth: try model fallback chain
last_error = None
for current_model in models_to_try:
payload = {
"model": current_model,
"tools": [
{
"type": "web_search",
"filters": {
"allowed_domains": ["reddit.com"]
}
}
],
"include": ["web_search_call.action.sources"],
"input": input_text,
}
try:
return http.post(url, payload, headers=headers, timeout=timeout)
except http.HTTPError as e:
last_error = e
if _is_model_access_error(e):
_log_info(f"Model {current_model} not accessible, trying fallback...")
continue
if e.status_code == 429:
_log_info(f"Rate limited on {current_model}, trying fallback model...")
continue
# Non-access error, don't retry with different model
raise
# All models failed with access errors
if last_error:
_log_error(f"All models failed. Last error: {last_error}")
raise last_error
raise http.HTTPError("No models available")
def search_subreddits(
subreddits: List[str],
topic: str,
from_date: str,
to_date: str,
count_per: int = 5,
) -> List[Dict[str, Any]]:
"""Search specific subreddits via Reddit's free JSON endpoint.
No API key needed. Uses reddit.com/r/{sub}/search/.json endpoint.
Used in Phase 2 supplemental search after entity extraction.
Args:
subreddits: List of subreddit names (without r/)
topic: Search topic
from_date: Start date (YYYY-MM-DD)
to_date: End date (YYYY-MM-DD)
count_per: Results to request per subreddit
Returns:
List of raw item dicts (same format as parse_reddit_response output).
"""
all_items = []
core = _extract_core_subject(topic)
for sub in subreddits:
sub = sub.lstrip("r/")
try:
url = f"https://www.reddit.com/r/{sub}/search/.json"
params = f"q={_url_encode(core)}&restrict_sr=on&sort=new&limit={count_per}&raw_json=1"
full_url = f"{url}?{params}"
headers = {
"User-Agent": http.USER_AGENT,
"Accept": "application/json",
}
data = http.get(full_url, headers=headers, timeout=15, retries=1)
# Reddit search returns {"data": {"children": [...]}}
children = data.get("data", {}).get("children", [])
for i, child in enumerate(children):
if child.get("kind") != "t3": # t3 = link/submission
continue
post = child.get("data", {})
permalink = post.get("permalink", "")
if not permalink:
continue
item = {
"id": f"RS{len(all_items)+1}",
"title": str(post.get("title", "")).strip(),
"url": f"https://www.reddit.com{permalink}",
"subreddit": str(post.get("subreddit", sub)).strip(),
"date": None,
"why_relevant": f"Found in r/{sub} supplemental search",
"relevance": 0.65, # Slightly lower default for supplemental
}
# Parse date from created_utc
created_utc = post.get("created_utc")
if created_utc:
from . import dates as dates_mod
item["date"] = dates_mod.timestamp_to_date(created_utc)
all_items.append(item)
except http.HTTPError as e:
_log_info(f"Subreddit search failed for r/{sub}: {e}")
if e.status_code == 429:
_log_info("Reddit rate-limited (429) — skipping remaining subreddits")
break
except Exception as e:
_log_info(f"Subreddit search error for r/{sub}: {e}")
return all_items
def _url_encode(text: str) -> str:
"""Simple URL encoding for query parameters."""
import urllib.parse
return urllib.parse.quote_plus(text)
def parse_reddit_response(response: Dict[str, Any]) -> List[Dict[str, Any]]:
"""Parse OpenAI response to extract Reddit items.
Args:
response: Raw API response
Returns:
List of item dicts
"""
items = []
# Check for API errors first
if "error" in response and response["error"]:
error = response["error"]
err_msg = error.get("message", str(error)) if isinstance(error, dict) else str(error)
_log_error(f"OpenAI API error: {err_msg}")
if http.DEBUG:
_log_error(f"Full error response: {json.dumps(response, indent=2)[:1000]}")
return items
# Try to find the output text
output_text = ""
if "output" in response:
output = response["output"]
if isinstance(output, str):
output_text = output
elif isinstance(output, list):
for item in output:
if isinstance(item, dict):
if item.get("type") == "message":
content = item.get("content", [])
for c in content:
if isinstance(c, dict) and c.get("type") == "output_text":
output_text = c.get("text", "")
break
elif "text" in item:
output_text = item["text"]
elif isinstance(item, str):
output_text = item
if output_text:
break
# Also check for choices (older format)
if not output_text and "choices" in response:
for choice in response["choices"]:
if "message" in choice:
output_text = choice["message"].get("content", "")
break
if not output_text:
print(f"[REDDIT WARNING] No output text found in OpenAI response. Keys present: {list(response.keys())}", flush=True)
return items
# Extract JSON from the response
json_match = re.search(r'\{[\s\S]*"items"[\s\S]*\}', output_text)
if json_match:
try:
data = json.loads(json_match.group())
items = data.get("items", [])
except json.JSONDecodeError:
pass
# Validate and clean items
clean_items = []
for i, item in enumerate(items):
if not isinstance(item, dict):
continue
url = item.get("url", "")
if not url or "reddit.com" not in url:
continue
clean_item = {
"id": f"R{i+1}",
"title": str(item.get("title", "")).strip(),
"url": url,
"subreddit": str(item.get("subreddit", "")).strip().lstrip("r/"),
"date": item.get("date"),
"why_relevant": str(item.get("why_relevant", "")).strip(),
"relevance": min(1.0, max(0.0, float(item.get("relevance", 0.5)))),
}
# Validate date format
if clean_item["date"]:
if not re.match(r'^\d{4}-\d{2}-\d{2}$', str(clean_item["date"])):
clean_item["date"] = None
clean_items.append(clean_item)
return clean_items
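The JSON extraction step above can be demonstrated in isolation; `extract_items` is a hypothetical helper using the same greedy regex:

```python
import json
import re

def extract_items(output_text: str):
    # Greedily grab the outermost {...} span that mentions "items",
    # tolerating model prose before the JSON payload
    m = re.search(r'\{[\s\S]*"items"[\s\S]*\}', output_text)
    if not m:
        return []
    try:
        return json.loads(m.group()).get("items", [])
    except json.JSONDecodeError:
        return []

text = 'Here you go:\n{"items": [{"title": "Thread", "relevance": 0.9}]}'
print(extract_items(text))  # [{'title': 'Thread', 'relevance': 0.9}]
```

Because the match is greedy, trailing prose containing a stray `}` would break `json.loads`; the production code accepts that trade-off and simply returns no items when parsing fails.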


@@ -0,0 +1,216 @@
"""Perplexity Sonar Pro web search via OpenRouter for last30days skill.
Uses OpenRouter's chat completions API with Perplexity's Sonar Pro model,
which has built-in web search and returns citations with URLs, titles, and dates.
This is the recommended web search backend -- highest quality results.
API docs: https://openrouter.ai/docs/quickstart
Model: perplexity/sonar-pro
"""
import re
import sys
from typing import Any, Dict, List, Optional
from urllib.parse import urlparse
from . import http
ENDPOINT = "https://openrouter.ai/api/v1/chat/completions"
MODEL = "perplexity/sonar-pro"
# Domains to exclude (handled by Reddit/X search)
EXCLUDED_DOMAINS = {
"reddit.com", "www.reddit.com", "old.reddit.com",
"twitter.com", "www.twitter.com", "x.com", "www.x.com",
}
def search_web(
topic: str,
from_date: str,
to_date: str,
api_key: str,
depth: str = "default",
) -> List[Dict[str, Any]]:
"""Search the web via Perplexity Sonar Pro on OpenRouter.
Args:
topic: Search topic
from_date: Start date (YYYY-MM-DD)
to_date: End date (YYYY-MM-DD)
api_key: OpenRouter API key
depth: 'quick', 'default', or 'deep'
Returns:
List of result dicts with keys: url, title, snippet, source_domain, date, relevance
Raises:
http.HTTPError: On API errors
"""
max_tokens = {"quick": 1024, "default": 2048, "deep": 4096}.get(depth, 2048)
prompt = (
f"Find recent blog posts, news articles, tutorials, and discussions "
f"about {topic} published between {from_date} and {to_date}. "
f"Exclude results from reddit.com, x.com, and twitter.com. "
f"For each result, provide the title, URL, publication date, "
f"and a brief summary of why it's relevant."
)
payload = {
"model": MODEL,
"messages": [{"role": "user", "content": prompt}],
"max_tokens": max_tokens,
}
sys.stderr.write(f"[Web] Searching Sonar Pro via OpenRouter for: {topic}\n")
sys.stderr.flush()
response = http.post(
ENDPOINT,
json_data=payload,
headers={
"Authorization": f"Bearer {api_key}",
"HTTP-Referer": "https://github.com/mvanhorn/last30days-openclaw",
"X-Title": "last30days",
},
timeout=30,
)
return _normalize_results(response)
def _normalize_results(response: Dict[str, Any]) -> List[Dict[str, Any]]:
"""Convert Sonar Pro response to websearch item schema.
Sonar Pro returns:
- search_results: [{title, url, date}] -- structured source metadata
- citations: [url, ...] -- flat list of cited URLs
- choices[0].message.content -- the synthesized text with [N] references
We prefer search_results (richer metadata), fall back to citations.
"""
items = []
# Try search_results first (has title, url, date)
search_results = response.get("search_results", [])
if isinstance(search_results, list) and search_results:
items = _parse_search_results(search_results)
# Fall back to citations if no search_results
if not items:
citations = response.get("citations", [])
content = _get_content(response)
if isinstance(citations, list) and citations:
items = _parse_citations(citations, content)
sys.stderr.write(f"[Web] Sonar Pro: {len(items)} results\n")
sys.stderr.flush()
return items
def _parse_search_results(results: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
"""Parse the search_results array from Sonar Pro."""
items = []
for i, result in enumerate(results):
if not isinstance(result, dict):
continue
url = result.get("url", "")
if not url:
continue
# Skip excluded domains
try:
domain = urlparse(url).netloc.lower()
if domain in EXCLUDED_DOMAINS:
continue
if domain.startswith("www."):
domain = domain[4:]
except Exception:
domain = ""
title = str(result.get("title", "")).strip()
if not title:
continue
# Sonar Pro provides dates in search_results
date = result.get("date")
date_confidence = "med" if date else "low"
items.append({
"id": f"W{i+1}",
"title": title[:200],
"url": url,
"source_domain": domain,
"snippet": str(result.get("snippet", result.get("description", ""))).strip()[:500],
"date": date,
"date_confidence": date_confidence,
"relevance": 0.7, # Sonar Pro results are generally high quality
"why_relevant": "",
})
return items
def _parse_citations(citations: List[str], content: str) -> List[Dict[str, Any]]:
"""Parse the flat citations array, enriching with content context."""
items = []
for i, url in enumerate(citations):
if not isinstance(url, str) or not url:
continue
# Skip excluded domains
try:
domain = urlparse(url).netloc.lower()
if domain in EXCLUDED_DOMAINS:
continue
if domain.startswith("www."):
domain = domain[4:]
except Exception:
domain = ""
# Try to extract title from content references like [1] Title...
title = _extract_title_for_citation(content, i + 1) or domain
items.append({
"id": f"W{i+1}",
"title": title[:200],
"url": url,
"source_domain": domain,
"snippet": "",
"date": None,
"date_confidence": "low",
"relevance": 0.6,
"why_relevant": "",
})
return items
def _get_content(response: Dict[str, Any]) -> str:
"""Extract the text content from the chat completion response."""
try:
return response["choices"][0]["message"]["content"]
except (KeyError, IndexError, TypeError):
return ""
def _extract_title_for_citation(content: str, index: int) -> Optional[str]:
"""Try to extract a title near a citation reference [N] in the content."""
if not content:
return None
# Look for patterns like [1] Title or [1](url) Title
pattern = rf'\[{index}\][)\s]*([^\[\n]{{5,80}})'
match = re.search(pattern, content)
if match:
title = match.group(1).strip().rstrip('.')
# Clean up markdown artifacts
title = re.sub(r'[*_`]', '', title)
return title if len(title) > 3 else None
return None
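The `[N]`-reference heuristic in `_extract_title_for_citation` can be run standalone; the sample content below is made up for illustration:

```python
# Sketch of the citation-title heuristic: capture up to 80 characters
# after a [N] marker, then strip markdown artifacts.
import re
from typing import Optional

def title_for_citation(content: str, index: int) -> Optional[str]:
    pattern = rf'\[{index}\][)\s]*([^\[\n]{{5,80}})'
    match = re.search(pattern, content)
    if not match:
        return None
    title = re.sub(r'[*_`]', '', match.group(1).strip().rstrip('.'))
    return title if len(title) > 3 else None

content = "Sources:\n[1] The State of AI Video Tools\n[2] Another Post"
print(title_for_citation(content, 1))  # The State of AI Video Tools
```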

@@ -0,0 +1,139 @@
"""Parallel AI web search for last30days skill.
Uses the Parallel AI Search API to find web content (blogs, docs, news, tutorials).
This backend returns LLM-optimized results
with extended excerpts ranked by relevance.
API docs: https://docs.parallel.ai/search-api/search-quickstart
"""
import sys
from typing import Any, Dict, List
from urllib.parse import urlparse
from . import http
ENDPOINT = "https://api.parallel.ai/v1beta/search"
# Domains to exclude (handled by Reddit/X search)
EXCLUDED_DOMAINS = {
"reddit.com", "www.reddit.com", "old.reddit.com",
"twitter.com", "www.twitter.com", "x.com", "www.x.com",
}
def search_web(
topic: str,
from_date: str,
to_date: str,
api_key: str,
depth: str = "default",
) -> List[Dict[str, Any]]:
"""Search the web via Parallel AI Search API.
Args:
topic: Search topic
from_date: Start date (YYYY-MM-DD)
to_date: End date (YYYY-MM-DD)
api_key: Parallel AI API key
depth: 'quick', 'default', or 'deep'
Returns:
List of result dicts with keys: url, title, snippet, source_domain, date, relevance
Raises:
http.HTTPError: On API errors
"""
max_results = {"quick": 8, "default": 15, "deep": 25}.get(depth, 15)
payload = {
"objective": (
f"Find recent blog posts, tutorials, news articles, and discussions "
f"about {topic} from {from_date} to {to_date}. "
f"Exclude reddit.com, x.com, and twitter.com."
),
"max_results": max_results,
"max_chars_per_result": 500,
}
sys.stderr.write(f"[Web] Searching Parallel AI for: {topic}\n")
sys.stderr.flush()
response = http.post(
ENDPOINT,
json_data=payload,
headers={
"Authorization": f"Bearer {api_key}",
"parallel-beta": "search-extract-2025-10-10",
},
timeout=30,
)
return _normalize_results(response)
def _normalize_results(response: Dict[str, Any]) -> List[Dict[str, Any]]:
"""Convert Parallel AI response to websearch item schema.
Args:
response: Raw API response
Returns:
List of normalized result dicts
"""
items = []
# Handle different response shapes
results = response.get("results", [])
if not isinstance(results, list):
return items
for i, result in enumerate(results):
if not isinstance(result, dict):
continue
url = result.get("url", "")
if not url:
continue
# Skip excluded domains
try:
domain = urlparse(url).netloc.lower()
if domain in EXCLUDED_DOMAINS:
continue
# Clean domain for display
if domain.startswith("www."):
domain = domain[4:]
except Exception:
domain = ""
title = str(result.get("title", "")).strip()
snippet = str(result.get("excerpt", result.get("snippet", result.get("description", "")))).strip()
if not title and not snippet:
continue
# Extract relevance score if provided
relevance = result.get("relevance_score", result.get("relevance", 0.6))
try:
relevance = min(1.0, max(0.0, float(relevance)))
except (TypeError, ValueError):
relevance = 0.6
items.append({
"id": f"W{i+1}",
"title": title[:200],
"url": url,
"source_domain": domain,
"snippet": snippet[:500],
"date": result.get("published_date", result.get("date")),
"date_confidence": "med" if result.get("published_date") or result.get("date") else "low",
"relevance": relevance,
"why_relevant": str(result.get("summary", "")).strip()[:200],
})
sys.stderr.write(f"[Web] Parallel AI: {len(items)} results\n")
sys.stderr.flush()
return items

@@ -0,0 +1,554 @@
"""Polymarket prediction market search via Gamma API (free, no auth required).
Uses gamma-api.polymarket.com for event/market discovery.
No API key needed - public read-only API with generous rate limits (350 req/10s).
"""
import json
import math
import re
import sys
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Any, Dict, List, Optional
from urllib.parse import urlencode
from . import http
GAMMA_SEARCH_URL = "https://gamma-api.polymarket.com/public-search"
# Pages to fetch per query (API returns 5 events per page, limit param is a no-op)
DEPTH_CONFIG = {
"quick": 1,
"default": 3,
"deep": 4,
}
# Max events to return after merge + dedup + re-ranking
RESULT_CAP = {
"quick": 5,
"default": 15,
"deep": 25,
}
def _log(msg: str):
"""Log to stderr (only in TTY mode to avoid cluttering Claude Code output)."""
if sys.stderr.isatty():
sys.stderr.write(f"[PM] {msg}\n")
sys.stderr.flush()
def _extract_core_subject(topic: str) -> str:
"""Extract core subject from topic string.
Strips common prefixes like 'last 7 days', 'what are people saying about', etc.
"""
topic = topic.strip()
# Remove common leading phrases
prefixes = [
r"^last \d+ days?\s+",
r"^what(?:'s| is| are) (?:people saying about|happening with|going on with)\s+",
r"^how (?:is|are)\s+",
r"^tell me about\s+",
r"^research\s+",
]
for pattern in prefixes:
topic = re.sub(pattern, "", topic, flags=re.IGNORECASE)
return topic.strip()
def _expand_queries(topic: str) -> List[str]:
"""Generate search queries to cast a wider net.
Strategy:
- Always include the core subject
- Add ALL individual words as standalone searches (not just first)
- Include the full topic if different from core
- Cap at 6 queries, dedupe
"""
core = _extract_core_subject(topic)
queries = [core]
# Add ALL individual words as separate queries
words = core.split()
if len(words) >= 2:
for word in words:
if len(word) > 1: # skip single-char words
queries.append(word)
# Add the full topic if different from core
if topic.lower().strip() != core.lower():
queries.append(topic.strip())
# Dedupe while preserving order, cap at 6
seen = set()
unique = []
for q in queries:
q_lower = q.lower().strip()
if q_lower and q_lower not in seen:
seen.add(q_lower)
unique.append(q.strip())
return unique[:6]
_GENERIC_TAGS = frozenset({"sports", "politics", "crypto", "science", "culture", "pop culture"})
def _extract_domain_queries(topic: str, events: List[Dict]) -> List[str]:
"""Extract domain-indicator search terms from first-pass event tags.
Uses structured tag metadata from Gamma API events to discover broader
domain categories (e.g., 'NCAA CBB' from a Big 12 basketball event).
Returns an empty list when no tag appears in at least two events.
"""
query_words = set(_extract_core_subject(topic).lower().split())
# Collect tag labels from all first-pass events, count occurrences
tag_counts: Dict[str, int] = {}
for event in events:
tags = event.get("tags") or []
for tag in tags:
label = tag.get("label", "") if isinstance(tag, dict) else str(tag)
if not label:
continue
label_lower = label.lower()
# Skip generic category tags and tags matching existing queries
if label_lower in _GENERIC_TAGS:
continue
if label_lower in query_words:
continue
tag_counts[label] = tag_counts.get(label, 0) + 1
# Sort by frequency, take top 2 that appear in 2+ events
domain_queries = [
label for label, count in sorted(tag_counts.items(), key=lambda x: -x[1])
if count >= 2
][:2]
return domain_queries
def _search_single_query(query: str, page: int = 1) -> Dict[str, Any]:
"""Run a single search query against Gamma API."""
params = {"q": query, "page": str(page)}
url = f"{GAMMA_SEARCH_URL}?{urlencode(params)}"
try:
response = http.request("GET", url, timeout=15, retries=2)
return response
except http.HTTPError as e:
_log(f"Search failed for '{query}' page {page}: {e}")
return {"events": [], "error": str(e)}
except Exception as e:
_log(f"Search failed for '{query}' page {page}: {e}")
return {"events": [], "error": str(e)}
def _run_queries_parallel(
queries: List[str], pages: int, all_events: Dict, errors: List, start_idx: int = 0,
) -> None:
"""Run (query, page) combinations in parallel, merging into all_events."""
with ThreadPoolExecutor(max_workers=max(1, min(8, len(queries) * pages))) as executor:
futures = {}
for i, q in enumerate(queries, start=start_idx):
for p in range(1, pages + 1):
future = executor.submit(_search_single_query, q, p)
futures[future] = i
for future in as_completed(futures):
query_idx = futures[future]
try:
response = future.result(timeout=15)
if response.get("error"):
errors.append(response["error"])
events = response.get("events", [])
for event in events:
event_id = event.get("id", "")
if not event_id:
continue
if event_id not in all_events:
all_events[event_id] = (event, query_idx)
elif query_idx < all_events[event_id][1]:
all_events[event_id] = (event, query_idx)
except Exception as e:
errors.append(str(e))
def search_polymarket(
topic: str,
from_date: str,
to_date: str,
depth: str = "default",
) -> Dict[str, Any]:
"""Search Polymarket via Gamma API with two-pass query expansion.
Pass 1: Run expanded queries in parallel, merge and dedupe by event ID.
Pass 2: Extract domain-indicator terms from first-pass event tags, search those.
Args:
topic: Search topic
from_date: Start date (YYYY-MM-DD) - used for activity filtering
to_date: End date (YYYY-MM-DD)
depth: 'quick', 'default', or 'deep'
Returns:
Dict with 'events' list and optional 'error'.
"""
pages = DEPTH_CONFIG.get(depth, DEPTH_CONFIG["default"])
cap = RESULT_CAP.get(depth, RESULT_CAP["default"])
queries = _expand_queries(topic)
_log(f"Searching for '{topic}' with queries: {queries} (pages={pages})")
# Pass 1: run expanded queries in parallel
all_events: Dict[str, tuple] = {}
errors: List[str] = []
_run_queries_parallel(queries, pages, all_events, errors)
# Pass 2: extract domain-indicator terms from first-pass event tags and search
first_pass_events = [ev for ev, _ in all_events.values()]
domain_queries = _extract_domain_queries(topic, first_pass_events)
# Filter out queries we already ran
seen_queries = {q.lower() for q in queries}
domain_queries = [dq for dq in domain_queries if dq.lower() not in seen_queries]
if domain_queries:
_log(f"Domain expansion queries: {domain_queries}")
_run_queries_parallel(domain_queries, 1, all_events, errors, start_idx=len(queries))
merged_events = [ev for ev, _ in sorted(all_events.values(), key=lambda x: x[1])]
total_queries = len(queries) + len(domain_queries)
_log(f"Found {len(merged_events)} unique events across {total_queries} queries")
result = {"events": merged_events, "_cap": cap}
if errors and not merged_events:
result["error"] = "; ".join(errors[:2])
return result
def _format_price_movement(market: Dict[str, Any]) -> Optional[str]:
"""Pick the most significant price change and format it.
Returns string like 'down 11.7% this month' or None if no significant change.
"""
changes = [
(abs(market.get("oneDayPriceChange") or 0), market.get("oneDayPriceChange"), "today"),
(abs(market.get("oneWeekPriceChange") or 0), market.get("oneWeekPriceChange"), "this week"),
(abs(market.get("oneMonthPriceChange") or 0), market.get("oneMonthPriceChange"), "this month"),
]
# Pick the largest absolute change
changes.sort(key=lambda x: x[0], reverse=True)
abs_change, raw_change, period = changes[0]
# Skip if change is less than 1% (noise)
if abs_change < 0.01:
return None
direction = "up" if raw_change > 0 else "down"
pct = abs_change * 100
return f"{direction} {pct:.1f}% {period}"
def _parse_outcome_prices(market: Dict[str, Any]) -> List[tuple]:
"""Parse outcomePrices JSON string into list of (outcome_name, price) tuples."""
outcomes_raw = market.get("outcomes") or []
prices_raw = market.get("outcomePrices")
if not prices_raw:
return []
# Both outcomes and outcomePrices can be JSON-encoded strings
try:
if isinstance(outcomes_raw, str):
outcomes = json.loads(outcomes_raw)
else:
outcomes = outcomes_raw
except (json.JSONDecodeError, TypeError):
outcomes = []
try:
if isinstance(prices_raw, str):
prices = json.loads(prices_raw)
else:
prices = prices_raw
except (json.JSONDecodeError, TypeError):
return []
result = []
for i, price in enumerate(prices):
try:
p = float(price)
except (ValueError, TypeError):
continue
name = outcomes[i] if i < len(outcomes) else f"Outcome {i+1}"
result.append((name, p))
return result
def _shorten_question(question: str) -> str:
"""Extract a short display name from a market question.
'Will Arizona win the 2026 NCAA Tournament?' -> 'Arizona'
'Will Duke be a number 1 seed in the 2026 NCAA...' -> 'Duke'
"""
q = question.strip().rstrip("?")
# Common patterns: "Will X win/be/...", "X wins/loses..."
m = re.match(r"^Will\s+(.+?)\s+(?:win|be|make|reach|have|lose|qualify|advance|strike|agree|pass|sign|get|become|remain|stay|leave|survive|next)\b", q, re.IGNORECASE)
if m:
return m.group(1).strip()
m = re.match(r"^Will\s+(.+?)\s+", q, re.IGNORECASE)
if m and len(m.group(1).split()) <= 4:
return m.group(1).strip()
# Fallback: truncate
return question[:40] if len(question) > 40 else question
def _compute_text_similarity(topic: str, title: str, outcomes: List[str] = None) -> float:
"""Score how well the event title (or outcome names) match the search topic.
Returns 0.0-1.0. Title substring match gets 1.0, outcome match gets 0.85/0.7,
title token overlap gets proportional score.
"""
core = _extract_core_subject(topic).lower()
title_lower = title.lower()
if not core:
return 0.5
# Full substring match in title
if core in title_lower:
return 1.0
# Check if topic appears in any outcome name (bidirectional)
if outcomes:
core_tokens = set(core.split())
best_outcome_score = 0.0
for outcome_name in outcomes:
outcome_lower = outcome_name.lower()
# Bidirectional: "arizona" in "arizona basketball" OR "arizona basketball" contains "arizona"
if core in outcome_lower or outcome_lower in core:
best_outcome_score = max(best_outcome_score, 0.85)
elif core_tokens & set(outcome_lower.split()):
best_outcome_score = max(best_outcome_score, 0.7)
if best_outcome_score > 0:
return best_outcome_score
# Token overlap fallback against title
topic_tokens = set(core.split())
title_tokens = set(title_lower.split())
if not topic_tokens:
return 0.5
overlap = len(topic_tokens & title_tokens)
return overlap / len(topic_tokens)
def _safe_float(val, default=0.0) -> float:
"""Safely convert a value to float."""
try:
return float(val or default)
except (ValueError, TypeError):
return default
def parse_polymarket_response(response: Dict[str, Any], topic: str = "") -> List[Dict[str, Any]]:
"""Parse Gamma API response into normalized item dicts.
Each event becomes one item showing its title and top markets.
Args:
response: Raw Gamma API response
topic: Original search topic (for relevance scoring)
Returns:
List of item dicts ready for normalization.
"""
events = response.get("events", [])
items = []
for i, event in enumerate(events):
event_id = event.get("id", "")
title = event.get("title", "")
slug = event.get("slug", "")
# Filter: skip closed/resolved events
if event.get("closed", False):
continue
if not event.get("active", True):
continue
# Get markets for this event
markets = event.get("markets", [])
if not markets:
continue
# Filter to active, open markets with liquidity (excludes resolved markets)
active_markets = []
for m in markets:
if m.get("closed", False):
continue
if not m.get("active", True):
continue
# Must have liquidity (resolved markets have 0 or None)
try:
liq = float(m.get("liquidity", 0) or 0)
except (ValueError, TypeError):
liq = 0
if liq > 0:
active_markets.append(m)
if not active_markets:
continue
# Sort markets by volume (most liquid first)
def market_volume(m):
try:
return float(m.get("volume", 0) or 0)
except (ValueError, TypeError):
return 0
active_markets.sort(key=market_volume, reverse=True)
# Take top market for the event
top_market = active_markets[0]
# Collect outcome names from ALL active markets (not just top) for similarity scoring
# Filter to outcomes with price > 1% to avoid noise
# Also extract subjects from market questions for neg-risk events (outcomes are Yes/No)
all_outcome_names = []
for m in active_markets:
for name, price in _parse_outcome_prices(m):
if price > 0.01 and name not in all_outcome_names:
all_outcome_names.append(name)
# For neg-risk binary markets (Yes/No outcomes), the team/entity name
# lives in the question, e.g., "Will Arizona win the NCAA Tournament?"
question = m.get("question", "")
if question and question != title:
all_outcome_names.append(question)
# Parse outcome prices - for multi-market events with Yes/No binary
# sub-markets, synthesize from market questions to show actual
# team/entity probabilities instead of a single market's Yes/No
outcome_prices = _parse_outcome_prices(top_market)
top_outcomes_are_binary = (
len(outcome_prices) == 2
and {n.lower() for n, _ in outcome_prices} == {"yes", "no"}
)
if top_outcomes_are_binary and len(active_markets) > 1:
synth_outcomes = []
for m in active_markets:
q = m.get("question", "")
if not q:
continue
pairs = _parse_outcome_prices(m)
yes_price = next((p for name, p in pairs if name.lower() == "yes"), None)
if yes_price is not None and yes_price > 0.005:
synth_outcomes.append((q, yes_price))
if synth_outcomes:
synth_outcomes.sort(key=lambda x: x[1], reverse=True)
outcome_prices = [(_shorten_question(q), p) for q, p in synth_outcomes]
# Format price movement
price_movement = _format_price_movement(top_market)
# Volume and liquidity - prefer event-level (more stable), fall back to market-level
event_volume1mo = _safe_float(event.get("volume1mo"))
event_volume1wk = _safe_float(event.get("volume1wk"))
event_liquidity = _safe_float(event.get("liquidity"))
event_competitive = _safe_float(event.get("competitive"))
volume24hr = _safe_float(event.get("volume24hr")) or _safe_float(top_market.get("volume24hr"))
liquidity = event_liquidity or _safe_float(top_market.get("liquidity"))
# Event URL
url = f"https://polymarket.com/event/{slug}" if slug else f"https://polymarket.com/event/{event_id}"
# Date: use updatedAt from event
updated_at = event.get("updatedAt", "")
date_str = None
if updated_at:
try:
date_str = updated_at[:10] # YYYY-MM-DD
except (IndexError, TypeError):
pass
# End date for the market
end_date = top_market.get("endDate")
if end_date:
try:
end_date = end_date[:10]
except (IndexError, TypeError):
end_date = None
# Quality-signal relevance (replaces position-based decay)
text_score = _compute_text_similarity(topic, title, all_outcome_names) if topic else 0.5
# Volume signal: log-scaled monthly volume (most stable signal)
vol_raw = event_volume1mo or event_volume1wk or volume24hr
vol_score = min(1.0, math.log1p(vol_raw) / 16) # ~$9M = 1.0
# Liquidity signal
liq_score = min(1.0, math.log1p(liquidity) / 14) # ~$1.2M = 1.0
# Price movement: daily weighted more than monthly
day_change = abs(top_market.get("oneDayPriceChange") or 0) * 3
week_change = abs(top_market.get("oneWeekPriceChange") or 0) * 2
month_change = abs(top_market.get("oneMonthPriceChange") or 0)
max_change = max(day_change, week_change, month_change)
movement_score = min(1.0, max_change * 5) # 20% change = 1.0
# Competitive bonus: markets near 50/50 are more interesting
competitive_score = event_competitive
relevance = min(1.0, (
0.30 * text_score +
0.30 * vol_score +
0.15 * liq_score +
0.15 * movement_score +
0.10 * competitive_score
))
# Surface the topic-matching outcome to the front before truncating
if topic and outcome_prices:
core = _extract_core_subject(topic).lower()
core_tokens = set(core.split())
reordered = []
rest = []
for pair in outcome_prices:
name_lower = pair[0].lower()
# Match if full core is substring, or name is substring of core,
# or any core token appears in the name (handles long question strings)
if (core in name_lower or name_lower in core
or any(tok in name_lower for tok in core_tokens if len(tok) > 2)):
reordered.append(pair)
else:
rest.append(pair)
if reordered:
outcome_prices = reordered + rest
# Top 3 outcomes for multi-outcome markets
top_outcomes = outcome_prices[:3]
remaining = max(0, len(outcome_prices) - 3)
items.append({
"event_id": event_id,
"title": title,
"question": top_market.get("question", title),
"url": url,
"outcome_prices": top_outcomes,
"outcomes_remaining": remaining,
"price_movement": price_movement,
"volume24hr": volume24hr,
"volume1mo": event_volume1mo,
"liquidity": liquidity,
"date": date_str,
"end_date": end_date,
"relevance": round(relevance, 2),
"why_relevant": f"Prediction market: {title[:60]}",
})
# Sort by relevance (quality-signal ranked) and apply cap
items.sort(key=lambda x: x["relevance"], reverse=True)
cap = response.get("_cap", len(items))
return items[:cap]
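The query-expansion strategy at the top of this module can be sketched in isolation (this mirrors `_expand_queries` for an already-extracted core subject; the sample topic is illustrative):

```python
# Keep the core subject, add each word longer than one character as its
# own query, dedupe case-insensitively, cap at 6.
from typing import List

def expand_queries(core: str) -> List[str]:
    queries = [core]
    words = core.split()
    if len(words) >= 2:
        queries.extend(w for w in words if len(w) > 1)
    seen, unique = set(), []
    for q in queries:
        key = q.lower().strip()
        if key and key not in seen:
            seen.add(key)
            unique.append(q.strip())
    return unique[:6]

print(expand_queries("Arizona basketball odds"))
# ['Arizona basketball odds', 'Arizona', 'basketball', 'odds']
```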

@@ -0,0 +1,256 @@
"""Reddit thread enrichment with real engagement metrics."""
import re
from typing import Any, Dict, List, Optional
from urllib.parse import urlparse
from . import http, dates
def extract_reddit_path(url: str) -> Optional[str]:
"""Extract the path from a Reddit URL.
Args:
url: Reddit URL
Returns:
Path component or None
"""
try:
parsed = urlparse(url)
if "reddit.com" not in parsed.netloc:
return None
return parsed.path
except Exception:
return None
class RedditRateLimitError(Exception):
"""Raised when Reddit returns HTTP 429 (rate limited)."""
pass
def fetch_thread_data(
url: str,
mock_data: Optional[Dict] = None,
timeout: int = 30,
retries: int = 3,
) -> Optional[Dict[str, Any]]:
"""Fetch Reddit thread JSON data.
Args:
url: Reddit thread URL
mock_data: Mock data for testing
timeout: HTTP timeout per attempt in seconds
retries: Number of retries on failure
Returns:
Thread data dict or None on failure
Raises:
RedditRateLimitError: When Reddit returns 429 (caller should bail)
"""
if mock_data is not None:
return mock_data
path = extract_reddit_path(url)
if not path:
return None
try:
data = http.get_reddit_json(path, timeout=timeout, retries=retries)
return data
except http.HTTPError as e:
if e.status_code == 429:
raise RedditRateLimitError(f"Reddit rate limited (429) fetching {url}") from e
return None
def parse_thread_data(data: Any) -> Dict[str, Any]:
"""Parse Reddit thread JSON into structured data.
Args:
data: Raw Reddit JSON response
Returns:
Dict with submission and comments data
"""
result = {
"submission": None,
"comments": [],
}
if not isinstance(data, list) or len(data) < 1:
return result
# First element is submission listing
submission_listing = data[0]
if isinstance(submission_listing, dict):
children = submission_listing.get("data", {}).get("children", [])
if children:
sub_data = children[0].get("data", {})
result["submission"] = {
"score": sub_data.get("score"),
"num_comments": sub_data.get("num_comments"),
"upvote_ratio": sub_data.get("upvote_ratio"),
"created_utc": sub_data.get("created_utc"),
"permalink": sub_data.get("permalink"),
"title": sub_data.get("title"),
"selftext": sub_data.get("selftext", "")[:500], # Truncate
}
# Second element is comments listing
if len(data) >= 2:
comments_listing = data[1]
if isinstance(comments_listing, dict):
children = comments_listing.get("data", {}).get("children", [])
for child in children:
if child.get("kind") != "t1": # t1 = comment
continue
c_data = child.get("data", {})
if not c_data.get("body"):
continue
comment = {
"score": c_data.get("score", 0),
"created_utc": c_data.get("created_utc"),
"author": c_data.get("author", "[deleted]"),
"body": c_data.get("body", "")[:300], # Truncate
"permalink": c_data.get("permalink"),
}
result["comments"].append(comment)
return result
def get_top_comments(comments: List[Dict], limit: int = 10) -> List[Dict[str, Any]]:
"""Get top comments sorted by score.
Args:
comments: List of comment dicts
limit: Maximum number to return
Returns:
Top comments sorted by score
"""
# Filter out deleted/removed
valid = [c for c in comments if c.get("author") not in ("[deleted]", "[removed]")]
# Sort by score descending
sorted_comments = sorted(valid, key=lambda c: c.get("score", 0), reverse=True)
return sorted_comments[:limit]
def extract_comment_insights(comments: List[Dict], limit: int = 7) -> List[str]:
"""Extract key insights from top comments.
Uses simple heuristics to identify valuable comments:
- Has substantive text
- Contains actionable information
- Not just agreement/disagreement
Args:
comments: Top comments
limit: Max insights to extract
Returns:
List of insight strings
"""
insights = []
for comment in comments[:limit * 2]: # Look at more comments than we need
body = comment.get("body", "").strip()
if not body or len(body) < 30:
continue
# Skip low-value patterns
skip_patterns = [
r'^(this|same|agreed|exactly|yep|nope|yes|no|thanks|thank you)\.?$',
r'^lol|lmao|haha',
r'^\[deleted\]',
r'^\[removed\]',
]
if any(re.match(p, body.lower()) for p in skip_patterns):
continue
# Truncate to first meaningful sentence or ~150 chars
insight = body[:150]
if len(body) > 150:
# Try to find a sentence boundary
for i, char in enumerate(insight):
if char in '.!?' and i > 50:
insight = insight[:i+1]
break
else:
insight = insight.rstrip() + "..."
insights.append(insight)
if len(insights) >= limit:
break
return insights
def enrich_reddit_item(
item: Dict[str, Any],
mock_thread_data: Optional[Dict] = None,
timeout: int = 10,
retries: int = 1,
) -> Dict[str, Any]:
"""Enrich a Reddit item with real engagement data.
Args:
item: Reddit item dict
mock_thread_data: Mock data for testing
timeout: HTTP timeout per attempt (default 10s for enrichment)
retries: Number of retries (default 1; fail fast for enrichment)
Returns:
Enriched item dict
Raises:
RedditRateLimitError: Propagated so caller can bail on remaining items
"""
url = item.get("url", "")
# Fetch thread data (RedditRateLimitError propagates to caller)
thread_data = fetch_thread_data(url, mock_thread_data, timeout=timeout, retries=retries)
if not thread_data:
return item
parsed = parse_thread_data(thread_data)
submission = parsed.get("submission")
comments = parsed.get("comments", [])
# Update engagement metrics
if submission:
item["engagement"] = {
"score": submission.get("score"),
"num_comments": submission.get("num_comments"),
"upvote_ratio": submission.get("upvote_ratio"),
}
# Update date from actual data
created_utc = submission.get("created_utc")
if created_utc:
item["date"] = dates.timestamp_to_date(created_utc)
# Get top comments
top_comments = get_top_comments(comments)
item["top_comments"] = []
for c in top_comments:
permalink = c.get("permalink", "")
comment_url = f"https://reddit.com{permalink}" if permalink else ""
item["top_comments"].append({
"score": c.get("score", 0),
"date": dates.timestamp_to_date(c.get("created_utc")),
"author": c.get("author", ""),
"excerpt": c.get("body", "")[:200],
"url": comment_url,
})
# Extract insights
item["comment_insights"] = extract_comment_insights(top_comments)
return item
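The two-element listing shape that `parse_thread_data` consumes can be sketched with mock Reddit JSON (field subset only; real responses carry many more keys):

```python
# Element 0 of the thread JSON is the submission listing, element 1 the
# comment listing; only kind == "t1" children are comments.
def parse_thread(data):
    result = {"submission": None, "comments": []}
    if not isinstance(data, list) or not data:
        return result
    children = data[0].get("data", {}).get("children", [])
    if children:
        sub = children[0].get("data", {})
        result["submission"] = {"score": sub.get("score"), "title": sub.get("title")}
    if len(data) >= 2:
        for child in data[1].get("data", {}).get("children", []):
            if child.get("kind") != "t1":
                continue
            c = child.get("data", {})
            if c.get("body"):
                result["comments"].append({"score": c.get("score", 0), "body": c["body"][:300]})
    return result

mock = [
    {"data": {"children": [{"data": {"score": 42, "title": "Best tools?"}}]}},
    {"data": {"children": [
        {"kind": "t1", "data": {"score": 7, "body": "Useful tip"}},
        {"kind": "more", "data": {}},
    ]}},
]
print(parse_thread(mock))
```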

@@ -0,0 +1,680 @@
"""Output rendering for last30days skill."""
import json
import os
import tempfile
from pathlib import Path
from typing import Optional
from . import schema
OUTPUT_DIR = Path.home() / ".local" / "share" / "last30days" / "out"
def _xref_tag(item) -> str:
"""Return ' [also on: Reddit, HN]' string if item has cross_refs, else ''."""
refs = getattr(item, 'cross_refs', None)
if not refs:
return ""
source_names = set()
for ref_id in refs:
if ref_id.startswith('R'):
source_names.add('Reddit')
elif ref_id.startswith('X'):
source_names.add('X')
elif ref_id.startswith('YT'):
source_names.add('YouTube')
elif ref_id.startswith('HN'):
source_names.add('HN')
elif ref_id.startswith('PM'):
source_names.add('Polymarket')
elif ref_id.startswith('W'):
source_names.add('Web')
if source_names:
return f" [also on: {', '.join(sorted(source_names))}]"
return ""
def ensure_output_dir():
"""Ensure output directory exists. Supports env override and sandbox fallback."""
global OUTPUT_DIR
env_dir = os.environ.get("LAST30DAYS_OUTPUT_DIR")
if env_dir:
OUTPUT_DIR = Path(env_dir)
try:
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
except PermissionError:
OUTPUT_DIR = Path(tempfile.gettempdir()) / "last30days" / "out"
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
def _assess_data_freshness(report: schema.Report) -> dict:
"""Assess how much data is actually from the last 30 days."""
reddit_recent = sum(1 for r in report.reddit if r.date and r.date >= report.range_from)
x_recent = sum(1 for x in report.x if x.date and x.date >= report.range_from)
web_recent = sum(1 for w in report.web if w.date and w.date >= report.range_from)
hn_recent = sum(1 for h in report.hackernews if h.date and h.date >= report.range_from)
pm_recent = sum(1 for p in report.polymarket if p.date and p.date >= report.range_from)
total_recent = reddit_recent + x_recent + web_recent + hn_recent + pm_recent
total_items = len(report.reddit) + len(report.x) + len(report.web) + len(report.hackernews) + len(report.polymarket)
return {
"reddit_recent": reddit_recent,
"x_recent": x_recent,
"web_recent": web_recent,
"total_recent": total_recent,
"total_items": total_items,
"is_sparse": total_recent < 5,
"mostly_evergreen": total_items > 0 and total_recent < total_items * 0.3,
}
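The two sparsity flags computed above drive the honesty warning in `render_compact`. A minimal sketch of just the thresholds (the helper name and signature are illustrative, not the module's):

```python
def assess_freshness(total_recent: int, total_items: int) -> dict:
    """Mirror the sparsity thresholds used by _assess_data_freshness."""
    return {
        # Fewer than 5 confirmed-recent items triggers the warning banner.
        "is_sparse": total_recent < 5,
        # Under 30% recent means results skew toward evergreen content.
        "mostly_evergreen": total_items > 0 and total_recent < total_items * 0.3,
    }
```

The `total_items > 0` guard matters: with zero items, `mostly_evergreen` stays `False` so an empty report is flagged as sparse, not evergreen.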
def render_compact(report: schema.Report, limit: int = 15, missing_keys: str = "none") -> str:
"""Render compact output for the assistant to synthesize.
Args:
report: Report data
limit: Max items per source
missing_keys: 'both', 'reddit', 'x', or 'none'
Returns:
Compact markdown string
"""
lines = []
# Header
lines.append(f"## Research Results: {report.topic}")
lines.append("")
# Assess data freshness and add honesty warning if needed
freshness = _assess_data_freshness(report)
if freshness["is_sparse"]:
lines.append("**⚠️ LIMITED RECENT DATA** - Few discussions from the last 30 days.")
lines.append(f"Only {freshness['total_recent']} item(s) confirmed from {report.range_from} to {report.range_to}.")
lines.append("Results below may include older/evergreen content. Be transparent with the user about this.")
lines.append("")
# Web-only mode banner (when no API keys)
if report.mode == "web-only":
lines.append("**🌐 WEB SEARCH MODE** - assistant will search blogs, docs & news")
lines.append("")
lines.append("---")
lines.append("**⚡ Want better results?** Add API keys or sign in to Codex to unlock Reddit & X data:")
lines.append("- `OPENAI_API_KEY` or `codex login` → Reddit threads with real upvotes & comments")
lines.append("- `XAI_API_KEY` → X posts with real likes & reposts")
lines.append("- Edit `~/.config/last30days/.env` to add keys")
lines.append("- If already signed in but still seeing this, re-run `codex login`")
lines.append("---")
lines.append("")
# Cache indicator
if report.from_cache:
age_str = f"{report.cache_age_hours:.1f}h old" if report.cache_age_hours else "cached"
lines.append(f"**⚡ CACHED RESULTS** ({age_str}) - use `--refresh` for fresh data")
lines.append("")
lines.append(f"**Date Range:** {report.range_from} to {report.range_to}")
lines.append(f"**Mode:** {report.mode}")
if report.openai_model_used:
lines.append(f"**OpenAI Model:** {report.openai_model_used}")
if report.xai_model_used:
lines.append(f"**xAI Model:** {report.xai_model_used}")
if report.resolved_x_handle:
lines.append(f"**Resolved X Handle:** @{report.resolved_x_handle}")
lines.append("")
# Coverage note for partial coverage
if report.mode == "reddit-only" and missing_keys in ("x", "none"):
lines.append("*💡 Tip: Add an xAI key (`XAI_API_KEY`) for X/Twitter data and better triangulation.*")
lines.append("")
elif report.mode == "x-only" and missing_keys in ("reddit", "none"):
lines.append("*💡 Tip: Add OPENAI_API_KEY or run `codex login` for Reddit data and better triangulation. If already signed in, re-run `codex login`.*")
lines.append("")
# Reddit items
if report.reddit_error:
lines.append("### Reddit Threads")
lines.append("")
lines.append(f"**ERROR:** {report.reddit_error}")
lines.append("")
elif report.mode in ("both", "reddit-only") and not report.reddit:
lines.append("### Reddit Threads")
lines.append("")
lines.append("*No relevant Reddit threads found for this topic.*")
lines.append("")
elif report.reddit:
lines.append("### Reddit Threads")
lines.append("")
for item in report.reddit[:limit]:
eng_str = ""
if item.engagement:
eng = item.engagement
parts = []
if eng.score is not None:
parts.append(f"{eng.score}pts")
if eng.num_comments is not None:
parts.append(f"{eng.num_comments}cmt")
if parts:
eng_str = f" [{', '.join(parts)}]"
date_str = f" ({item.date})" if item.date else " (date unknown)"
conf_str = f" [date:{item.date_confidence}]" if item.date_confidence != "high" else ""
lines.append(f"**{item.id}** (score:{item.score}) r/{item.subreddit}{date_str}{conf_str}{eng_str}{_xref_tag(item)}")
lines.append(f" {item.title}")
lines.append(f" {item.url}")
lines.append(f" *{item.why_relevant}*")
# Top comment insights
if item.comment_insights:
lines.append(" Insights:")
for insight in item.comment_insights[:3]:
lines.append(f" - {insight}")
lines.append("")
# X items
if report.x_error:
lines.append("### X Posts")
lines.append("")
lines.append(f"**ERROR:** {report.x_error}")
lines.append("")
elif report.mode in ("both", "x-only", "all", "x-web") and not report.x:
lines.append("### X Posts")
lines.append("")
lines.append("*No relevant X posts found for this topic.*")
lines.append("")
elif report.x:
lines.append("### X Posts")
lines.append("")
for item in report.x[:limit]:
eng_str = ""
if item.engagement:
eng = item.engagement
parts = []
if eng.likes is not None:
parts.append(f"{eng.likes}likes")
if eng.reposts is not None:
parts.append(f"{eng.reposts}rt")
if parts:
eng_str = f" [{', '.join(parts)}]"
date_str = f" ({item.date})" if item.date else " (date unknown)"
conf_str = f" [date:{item.date_confidence}]" if item.date_confidence != "high" else ""
lines.append(f"**{item.id}** (score:{item.score}) @{item.author_handle}{date_str}{conf_str}{eng_str}{_xref_tag(item)}")
lines.append(f" {item.text[:200]}...")
lines.append(f" {item.url}")
lines.append(f" *{item.why_relevant}*")
lines.append("")
# YouTube items
if report.youtube_error:
lines.append("### YouTube Videos")
lines.append("")
lines.append(f"**ERROR:** {report.youtube_error}")
lines.append("")
elif report.youtube:
lines.append("### YouTube Videos")
lines.append("")
for item in report.youtube[:limit]:
eng_str = ""
if item.engagement:
eng = item.engagement
parts = []
if eng.views is not None:
parts.append(f"{eng.views:,} views")
if eng.likes is not None:
parts.append(f"{eng.likes:,} likes")
if parts:
eng_str = f" [{', '.join(parts)}]"
date_str = f" ({item.date})" if item.date else ""
lines.append(f"**{item.id}** (score:{item.score}) {item.channel_name}{date_str}{eng_str}{_xref_tag(item)}")
lines.append(f" {item.title}")
lines.append(f" {item.url}")
if item.transcript_snippet:
snippet = item.transcript_snippet[:200]
if len(item.transcript_snippet) > 200:
snippet += "..."
lines.append(f" Transcript: {snippet}")
lines.append(f" *{item.why_relevant}*")
lines.append("")
# Hacker News items
if report.hackernews_error:
lines.append("### Hacker News Stories")
lines.append("")
lines.append(f"**ERROR:** {report.hackernews_error}")
lines.append("")
elif report.hackernews:
lines.append("### Hacker News Stories")
lines.append("")
for item in report.hackernews[:limit]:
eng_str = ""
if item.engagement:
eng = item.engagement
parts = []
if eng.score is not None:
parts.append(f"{eng.score}pts")
if eng.num_comments is not None:
parts.append(f"{eng.num_comments}cmt")
if parts:
eng_str = f" [{', '.join(parts)}]"
date_str = f" ({item.date})" if item.date else ""
lines.append(f"**{item.id}** (score:{item.score}) hn/{item.author}{date_str}{eng_str}{_xref_tag(item)}")
lines.append(f" {item.title}")
lines.append(f" {item.hn_url}")
lines.append(f" *{item.why_relevant}*")
# Comment insights
if item.comment_insights:
lines.append(" Insights:")
for insight in item.comment_insights[:3]:
lines.append(f" - {insight}")
lines.append("")
# Polymarket items
if report.polymarket_error:
lines.append("### Prediction Markets (Polymarket)")
lines.append("")
lines.append(f"**ERROR:** {report.polymarket_error}")
lines.append("")
elif report.polymarket:
lines.append("### Prediction Markets (Polymarket)")
lines.append("")
for item in report.polymarket[:limit]:
eng_str = ""
if item.engagement:
eng = item.engagement
parts = []
if eng.volume is not None:
if eng.volume >= 1_000_000:
parts.append(f"${eng.volume/1_000_000:.1f}M volume")
elif eng.volume >= 1_000:
parts.append(f"${eng.volume/1_000:.0f}K volume")
else:
parts.append(f"${eng.volume:.0f} volume")
if eng.liquidity is not None:
if eng.liquidity >= 1_000_000:
parts.append(f"${eng.liquidity/1_000_000:.1f}M liquidity")
elif eng.liquidity >= 1_000:
parts.append(f"${eng.liquidity/1_000:.0f}K liquidity")
else:
parts.append(f"${eng.liquidity:.0f} liquidity")
if parts:
eng_str = f" [{', '.join(parts)}]"
date_str = f" ({item.date})" if item.date else ""
lines.append(f"**{item.id}** (score:{item.score}){eng_str}{_xref_tag(item)}")
lines.append(f" {item.question}")
# Outcome prices with price movement
if item.outcome_prices:
outcomes = []
for name, price in item.outcome_prices:
pct = price * 100
outcomes.append(f"{name}: {pct:.0f}%")
outcome_line = " | ".join(outcomes)
if item.outcomes_remaining > 0:
outcome_line += f" and {item.outcomes_remaining} more"
if item.price_movement:
outcome_line += f" ({item.price_movement})"
lines.append(f" {outcome_line}")
lines.append(f" {item.url}")
lines.append(f" *{item.why_relevant}*")
lines.append("")
# Web items (if any - populated by the assistant)
if report.web_error:
lines.append("### Web Results")
lines.append("")
lines.append(f"**ERROR:** {report.web_error}")
lines.append("")
elif report.web:
lines.append("### Web Results")
lines.append("")
for item in report.web[:limit]:
date_str = f" ({item.date})" if item.date else " (date unknown)"
conf_str = f" [date:{item.date_confidence}]" if item.date_confidence != "high" else ""
lines.append(f"**{item.id}** [WEB] (score:{item.score}) {item.source_domain}{date_str}{conf_str}{_xref_tag(item)}")
lines.append(f" {item.title}")
lines.append(f" {item.url}")
lines.append(f" {item.snippet[:150]}...")
lines.append(f" *{item.why_relevant}*")
lines.append("")
return "\n".join(lines)
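The Polymarket branch above formats volume and liquidity with the same M/K/unit tiering. Pulled out as a standalone sketch (`fmt_money` is a hypothetical helper, not in the module):

```python
def fmt_money(value: float, label: str) -> str:
    """Human-readable dollar amounts, tiered as in the Polymarket section."""
    if value >= 1_000_000:
        return f"${value / 1_000_000:.1f}M {label}"  # one decimal for millions
    if value >= 1_000:
        return f"${value / 1_000:.0f}K {label}"      # whole thousands
    return f"${value:.0f} {label}"                   # raw dollars below 1K

print(fmt_money(2_500_000, "volume"))
```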
def render_source_status(report: schema.Report, source_info: Optional[dict] = None) -> str:
"""Render source status footer showing what was used/skipped and why.
Args:
report: Report data
source_info: Dict with source availability info:
x_skip_reason, youtube_skip_reason, web_skip_reason
Returns:
Source status markdown string
"""
if source_info is None:
source_info = {}
lines = []
lines.append("---")
lines.append("**Sources:**")
# Reddit
if report.reddit_error:
lines.append(f" ❌ Reddit: error — {report.reddit_error}")
elif report.reddit:
lines.append(f" ✅ Reddit: {len(report.reddit)} threads")
elif report.mode in ("both", "reddit-only", "all", "reddit-web"):
pass # Hide zero-result sources
else:
reason = source_info.get("reddit_skip_reason", "not configured")
lines.append(f" ⏭️ Reddit: skipped — {reason}")
# X
if report.x_error:
lines.append(f" ❌ X: error — {report.x_error}")
elif report.x:
x_line = f" ✅ X: {len(report.x)} posts"
if report.resolved_x_handle:
x_line += f" (via @{report.resolved_x_handle} + keyword search)"
lines.append(x_line)
elif report.mode in ("both", "x-only", "all", "x-web"):
pass # Hide zero-result sources
else:
reason = source_info.get("x_skip_reason", "No Bird CLI or XAI_API_KEY")
lines.append(f" ⏭️ X: skipped — {reason}")
# YouTube
if report.youtube_error:
lines.append(f" ❌ YouTube: error — {report.youtube_error}")
elif report.youtube:
with_transcripts = sum(1 for v in report.youtube if getattr(v, 'transcript_snippet', None))
lines.append(f" ✅ YouTube: {len(report.youtube)} videos ({with_transcripts} with transcripts)")
# Hide when zero results (no skip reason line needed)
# Hacker News
if report.hackernews_error:
lines.append(f" ❌ HN: error — {report.hackernews_error}")
elif report.hackernews:
lines.append(f" ✅ HN: {len(report.hackernews)} stories")
# Hide when zero results
# Polymarket
if report.polymarket_error:
lines.append(f" ❌ Polymarket: error — {report.polymarket_error}")
elif report.polymarket:
lines.append(f" ✅ Polymarket: {len(report.polymarket)} markets")
# Hide when zero results
# Web
if report.web_error:
lines.append(f" ❌ Web: error — {report.web_error}")
elif report.web:
lines.append(f" ✅ Web: {len(report.web)} pages")
else:
reason = source_info.get("web_skip_reason", "assistant will use WebSearch")
lines.append(f" ⚡ Web: {reason}")
lines.append("")
return "\n".join(lines)
def render_context_snippet(report: schema.Report) -> str:
"""Render reusable context snippet.
Args:
report: Report data
Returns:
Context markdown string
"""
lines = []
lines.append(f"# Context: {report.topic} (Last 30 Days)")
lines.append("")
lines.append(f"*Generated: {report.generated_at[:10]} | Sources: {report.mode}*")
lines.append("")
# Key sources summary
lines.append("## Key Sources")
lines.append("")
all_items = []
for item in report.reddit[:5]:
all_items.append((item.score, "Reddit", item.title, item.url))
for item in report.x[:5]:
all_items.append((item.score, "X", item.text[:50] + "...", item.url))
for item in report.hackernews[:5]:
all_items.append((item.score, "HN", item.title[:50] + "...", item.hn_url))
for item in report.polymarket[:5]:
all_items.append((item.score, "Polymarket", item.question[:50] + "...", item.url))
for item in report.web[:5]:
all_items.append((item.score, "Web", item.title[:50] + "...", item.url))
all_items.sort(key=lambda x: -x[0])
for score, source, text, url in all_items[:7]:
lines.append(f"- [{source}] {text}")
lines.append("")
lines.append("## Summary")
lines.append("")
lines.append("*See full report for best practices, prompt pack, and detailed sources.*")
lines.append("")
return "\n".join(lines)
def render_full_report(report: schema.Report) -> str:
"""Render full markdown report.
Args:
report: Report data
Returns:
Full report markdown
"""
lines = []
# Title
lines.append(f"# {report.topic} - Last 30 Days Research Report")
lines.append("")
lines.append(f"**Generated:** {report.generated_at}")
lines.append(f"**Date Range:** {report.range_from} to {report.range_to}")
lines.append(f"**Mode:** {report.mode}")
lines.append("")
# Models
lines.append("## Models Used")
lines.append("")
if report.openai_model_used:
lines.append(f"- **OpenAI:** {report.openai_model_used}")
if report.xai_model_used:
lines.append(f"- **xAI:** {report.xai_model_used}")
lines.append("")
# Reddit section
if report.reddit:
lines.append("## Reddit Threads")
lines.append("")
for item in report.reddit:
lines.append(f"### {item.id}: {item.title}")
lines.append("")
lines.append(f"- **Subreddit:** r/{item.subreddit}")
lines.append(f"- **URL:** {item.url}")
lines.append(f"- **Date:** {item.date or 'Unknown'} (confidence: {item.date_confidence})")
lines.append(f"- **Score:** {item.score}/100")
lines.append(f"- **Relevance:** {item.why_relevant}")
if item.engagement:
eng = item.engagement
lines.append(f"- **Engagement:** {eng.score or '?'} points, {eng.num_comments or '?'} comments")
if item.comment_insights:
lines.append("")
lines.append("**Key Insights from Comments:**")
for insight in item.comment_insights:
lines.append(f"- {insight}")
lines.append("")
# X section
if report.x:
lines.append("## X Posts")
lines.append("")
for item in report.x:
lines.append(f"### {item.id}: @{item.author_handle}")
lines.append("")
lines.append(f"- **URL:** {item.url}")
lines.append(f"- **Date:** {item.date or 'Unknown'} (confidence: {item.date_confidence})")
lines.append(f"- **Score:** {item.score}/100")
lines.append(f"- **Relevance:** {item.why_relevant}")
if item.engagement:
eng = item.engagement
lines.append(f"- **Engagement:** {eng.likes or '?'} likes, {eng.reposts or '?'} reposts")
lines.append("")
lines.append(f"> {item.text}")
lines.append("")
# HN section
if report.hackernews:
lines.append("## Hacker News Stories")
lines.append("")
for item in report.hackernews:
lines.append(f"### {item.id}: {item.title}")
lines.append("")
lines.append(f"- **Author:** {item.author}")
lines.append(f"- **HN URL:** {item.hn_url}")
if item.url:
lines.append(f"- **Article URL:** {item.url}")
lines.append(f"- **Date:** {item.date or 'Unknown'}")
lines.append(f"- **Score:** {item.score}/100")
lines.append(f"- **Relevance:** {item.why_relevant}")
if item.engagement:
eng = item.engagement
lines.append(f"- **Engagement:** {eng.score or '?'} points, {eng.num_comments or '?'} comments")
if item.comment_insights:
lines.append("")
lines.append("**Key Insights from Comments:**")
for insight in item.comment_insights:
lines.append(f"- {insight}")
lines.append("")
# Polymarket section
if report.polymarket:
lines.append("## Prediction Markets (Polymarket)")
lines.append("")
for item in report.polymarket:
lines.append(f"### {item.id}: {item.question}")
lines.append("")
lines.append(f"- **Event:** {item.title}")
lines.append(f"- **URL:** {item.url}")
lines.append(f"- **Date:** {item.date or 'Unknown'}")
lines.append(f"- **Score:** {item.score}/100")
if item.outcome_prices:
outcomes = [f"{name}: {price*100:.0f}%" for name, price in item.outcome_prices]
lines.append(f"- **Outcomes:** {' | '.join(outcomes)}")
if item.price_movement:
lines.append(f"- **Trend:** {item.price_movement}")
if item.engagement:
eng = item.engagement
lines.append(f"- **Volume:** ${eng.volume or 0:,.0f} | Liquidity: ${eng.liquidity or 0:,.0f}")
lines.append("")
# Web section
if report.web:
lines.append("## Web Results")
lines.append("")
for item in report.web:
lines.append(f"### {item.id}: {item.title}")
lines.append("")
lines.append(f"- **Source:** {item.source_domain}")
lines.append(f"- **URL:** {item.url}")
lines.append(f"- **Date:** {item.date or 'Unknown'} (confidence: {item.date_confidence})")
lines.append(f"- **Score:** {item.score}/100")
lines.append(f"- **Relevance:** {item.why_relevant}")
lines.append("")
lines.append(f"> {item.snippet}")
lines.append("")
# Placeholders for assistant synthesis
lines.append("## Best Practices")
lines.append("")
lines.append("*To be synthesized by assistant*")
lines.append("")
lines.append("## Prompt Pack")
lines.append("")
lines.append("*To be synthesized by assistant*")
lines.append("")
return "\n".join(lines)
def write_outputs(
report: schema.Report,
raw_openai: Optional[dict] = None,
raw_xai: Optional[dict] = None,
raw_reddit_enriched: Optional[list] = None,
):
"""Write all output files.
Args:
report: Report data
raw_openai: Raw OpenAI API response
raw_xai: Raw xAI API response
raw_reddit_enriched: Raw enriched Reddit thread data
"""
ensure_output_dir()
# report.json
with open(OUTPUT_DIR / "report.json", 'w') as f:
json.dump(report.to_dict(), f, indent=2)
# report.md
with open(OUTPUT_DIR / "report.md", 'w') as f:
f.write(render_full_report(report))
# last30days.context.md
with open(OUTPUT_DIR / "last30days.context.md", 'w') as f:
f.write(render_context_snippet(report))
# Raw responses
if raw_openai:
with open(OUTPUT_DIR / "raw_openai.json", 'w') as f:
json.dump(raw_openai, f, indent=2)
if raw_xai:
with open(OUTPUT_DIR / "raw_xai.json", 'w') as f:
json.dump(raw_xai, f, indent=2)
if raw_reddit_enriched:
with open(OUTPUT_DIR / "raw_reddit_threads_enriched.json", 'w') as f:
json.dump(raw_reddit_enriched, f, indent=2)
def get_context_path() -> str:
"""Get path to context file."""
return str(OUTPUT_DIR / "last30days.context.md")
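The directory-resolution chain in `ensure_output_dir` (env override, then home dir, then temp fallback on `PermissionError`) can be sketched without module-global state. `resolve_output_dir` is a hypothetical standalone version for illustration:

```python
import os
import tempfile
from pathlib import Path

def resolve_output_dir(default: Path) -> Path:
    """Hypothetical standalone version of the ensure_output_dir fallback chain."""
    env_dir = os.environ.get("LAST30DAYS_OUTPUT_DIR")
    out = Path(env_dir) if env_dir else default
    try:
        out.mkdir(parents=True, exist_ok=True)
    except PermissionError:
        # Sandboxed environments may forbid writes under $HOME; fall back to tmp.
        out = Path(tempfile.gettempdir()) / "last30days" / "out"
        out.mkdir(parents=True, exist_ok=True)
    return out

os.environ.pop("LAST30DAYS_OUTPUT_DIR", None)  # take the default branch
chosen = resolve_output_dir(Path(tempfile.gettempdir()) / "last30days_demo")
```

Returning the resolved path (instead of mutating a module global, as the real function does) makes the fallback order straightforward to test.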


@@ -0,0 +1,586 @@
"""Data schemas for last30days skill."""
from dataclasses import dataclass, field, asdict
from typing import Any, Dict, List, Optional
from datetime import datetime, timezone
@dataclass
class Engagement:
"""Engagement metrics."""
# Reddit fields
score: Optional[int] = None
num_comments: Optional[int] = None
upvote_ratio: Optional[float] = None
# X fields
likes: Optional[int] = None
reposts: Optional[int] = None
replies: Optional[int] = None
quotes: Optional[int] = None
# YouTube fields
views: Optional[int] = None
# Polymarket fields
volume: Optional[float] = None
liquidity: Optional[float] = None
def to_dict(self) -> Dict[str, Any]:
d = {}
if self.score is not None:
d['score'] = self.score
if self.num_comments is not None:
d['num_comments'] = self.num_comments
if self.upvote_ratio is not None:
d['upvote_ratio'] = self.upvote_ratio
if self.likes is not None:
d['likes'] = self.likes
if self.reposts is not None:
d['reposts'] = self.reposts
if self.replies is not None:
d['replies'] = self.replies
if self.quotes is not None:
d['quotes'] = self.quotes
if self.views is not None:
d['views'] = self.views
if self.volume is not None:
d['volume'] = self.volume
if self.liquidity is not None:
d['liquidity'] = self.liquidity
return d if d else None
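`Engagement.to_dict` hand-writes the same pattern for every field: omit `None` values, and collapse an all-empty dict to `None` so serialized items carry `"engagement": null` rather than `{}`. The generic equivalent, as a sketch (`compact` is hypothetical, not in the module):

```python
from typing import Any, Dict, Optional

def compact(**fields: Any) -> Optional[Dict[str, Any]]:
    """Keep only non-None fields; return None when nothing remains."""
    d = {k: v for k, v in fields.items() if v is not None}
    return d or None
```

The explicit per-field version above trades brevity for a fixed, documented key order; the comprehension form is equivalent in behavior.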
@dataclass
class Comment:
"""Reddit comment."""
score: int
date: Optional[str]
author: str
excerpt: str
url: str
def to_dict(self) -> Dict[str, Any]:
return {
'score': self.score,
'date': self.date,
'author': self.author,
'excerpt': self.excerpt,
'url': self.url,
}
@dataclass
class SubScores:
"""Component scores."""
relevance: int = 0
recency: int = 0
engagement: int = 0
def to_dict(self) -> Dict[str, int]:
return {
'relevance': self.relevance,
'recency': self.recency,
'engagement': self.engagement,
}
@dataclass
class RedditItem:
"""Normalized Reddit item."""
id: str
title: str
url: str
subreddit: str
date: Optional[str] = None
date_confidence: str = "low"
engagement: Optional[Engagement] = None
top_comments: List[Comment] = field(default_factory=list)
comment_insights: List[str] = field(default_factory=list)
relevance: float = 0.5
why_relevant: str = ""
subs: SubScores = field(default_factory=SubScores)
score: int = 0
cross_refs: List[str] = field(default_factory=list)
def to_dict(self) -> Dict[str, Any]:
d = {
'id': self.id,
'title': self.title,
'url': self.url,
'subreddit': self.subreddit,
'date': self.date,
'date_confidence': self.date_confidence,
'engagement': self.engagement.to_dict() if self.engagement else None,
'top_comments': [c.to_dict() for c in self.top_comments],
'comment_insights': self.comment_insights,
'relevance': self.relevance,
'why_relevant': self.why_relevant,
'subs': self.subs.to_dict(),
'score': self.score,
}
if self.cross_refs:
d['cross_refs'] = self.cross_refs
return d
@dataclass
class XItem:
"""Normalized X item."""
id: str
text: str
url: str
author_handle: str
date: Optional[str] = None
date_confidence: str = "low"
engagement: Optional[Engagement] = None
relevance: float = 0.5
why_relevant: str = ""
subs: SubScores = field(default_factory=SubScores)
score: int = 0
cross_refs: List[str] = field(default_factory=list)
def to_dict(self) -> Dict[str, Any]:
d = {
'id': self.id,
'text': self.text,
'url': self.url,
'author_handle': self.author_handle,
'date': self.date,
'date_confidence': self.date_confidence,
'engagement': self.engagement.to_dict() if self.engagement else None,
'relevance': self.relevance,
'why_relevant': self.why_relevant,
'subs': self.subs.to_dict(),
'score': self.score,
}
if self.cross_refs:
d['cross_refs'] = self.cross_refs
return d
@dataclass
class WebSearchItem:
"""Normalized web search item (no engagement metrics)."""
id: str
title: str
url: str
source_domain: str # e.g., "medium.com", "github.com"
snippet: str
date: Optional[str] = None
date_confidence: str = "low"
relevance: float = 0.5
why_relevant: str = ""
subs: SubScores = field(default_factory=SubScores)
score: int = 0
cross_refs: List[str] = field(default_factory=list)
def to_dict(self) -> Dict[str, Any]:
d = {
'id': self.id,
'title': self.title,
'url': self.url,
'source_domain': self.source_domain,
'snippet': self.snippet,
'date': self.date,
'date_confidence': self.date_confidence,
'relevance': self.relevance,
'why_relevant': self.why_relevant,
'subs': self.subs.to_dict(),
'score': self.score,
}
if self.cross_refs:
d['cross_refs'] = self.cross_refs
return d
@dataclass
class YouTubeItem:
"""Normalized YouTube item."""
id: str # video_id
title: str
url: str
channel_name: str
date: Optional[str] = None
date_confidence: str = "high" # YouTube dates are always reliable
engagement: Optional[Engagement] = None
transcript_snippet: str = ""
relevance: float = 0.7
why_relevant: str = ""
subs: SubScores = field(default_factory=SubScores)
score: int = 0
cross_refs: List[str] = field(default_factory=list)
def to_dict(self) -> Dict[str, Any]:
d = {
'id': self.id,
'title': self.title,
'url': self.url,
'channel_name': self.channel_name,
'date': self.date,
'date_confidence': self.date_confidence,
'engagement': self.engagement.to_dict() if self.engagement else None,
'transcript_snippet': self.transcript_snippet,
'relevance': self.relevance,
'why_relevant': self.why_relevant,
'subs': self.subs.to_dict(),
'score': self.score,
}
if self.cross_refs:
d['cross_refs'] = self.cross_refs
return d
@dataclass
class HackerNewsItem:
"""Normalized Hacker News item."""
id: str # "HN1", "HN2", ...
title: str
url: str # Original article URL
hn_url: str # news.ycombinator.com/item?id=...
author: str # HN username
date: Optional[str] = None
date_confidence: str = "high" # Algolia provides exact timestamps
engagement: Optional[Engagement] = None # points + num_comments
top_comments: List[Comment] = field(default_factory=list)
comment_insights: List[str] = field(default_factory=list)
relevance: float = 0.5
why_relevant: str = ""
subs: SubScores = field(default_factory=SubScores)
score: int = 0
cross_refs: List[str] = field(default_factory=list)
def to_dict(self) -> Dict[str, Any]:
d = {
'id': self.id,
'title': self.title,
'url': self.url,
'hn_url': self.hn_url,
'author': self.author,
'date': self.date,
'date_confidence': self.date_confidence,
'engagement': self.engagement.to_dict() if self.engagement else None,
'top_comments': [c.to_dict() for c in self.top_comments],
'comment_insights': self.comment_insights,
'relevance': self.relevance,
'why_relevant': self.why_relevant,
'subs': self.subs.to_dict(),
'score': self.score,
}
if self.cross_refs:
d['cross_refs'] = self.cross_refs
return d
@dataclass
class PolymarketItem:
"""Normalized Polymarket prediction market item."""
id: str # "PM1", "PM2", ...
title: str # Event title
question: str # Top market question
url: str # Event page URL
outcome_prices: List[tuple] = field(default_factory=list) # [(name, price), ...]
outcomes_remaining: int = 0
price_movement: Optional[str] = None # "down 11.7% this month"
date: Optional[str] = None
date_confidence: str = "high" # API provides exact timestamps
engagement: Optional[Engagement] = None # volume + liquidity
end_date: Optional[str] = None
relevance: float = 0.5
why_relevant: str = ""
subs: SubScores = field(default_factory=SubScores)
score: int = 0
cross_refs: List[str] = field(default_factory=list)
def to_dict(self) -> Dict[str, Any]:
d = {
'id': self.id,
'title': self.title,
'question': self.question,
'url': self.url,
'outcome_prices': self.outcome_prices,
'outcomes_remaining': self.outcomes_remaining,
'price_movement': self.price_movement,
'date': self.date,
'date_confidence': self.date_confidence,
'engagement': self.engagement.to_dict() if self.engagement else None,
'end_date': self.end_date,
'relevance': self.relevance,
'why_relevant': self.why_relevant,
'subs': self.subs.to_dict(),
'score': self.score,
}
if self.cross_refs:
d['cross_refs'] = self.cross_refs
return d
@dataclass
class Report:
"""Full research report."""
topic: str
range_from: str
range_to: str
generated_at: str
mode: str # 'reddit-only', 'x-only', 'both', 'web-only', etc.
openai_model_used: Optional[str] = None
xai_model_used: Optional[str] = None
reddit: List[RedditItem] = field(default_factory=list)
x: List[XItem] = field(default_factory=list)
web: List[WebSearchItem] = field(default_factory=list)
youtube: List[YouTubeItem] = field(default_factory=list)
hackernews: List[HackerNewsItem] = field(default_factory=list)
polymarket: List[PolymarketItem] = field(default_factory=list)
best_practices: List[str] = field(default_factory=list)
prompt_pack: List[str] = field(default_factory=list)
context_snippet_md: str = ""
# Status tracking
reddit_error: Optional[str] = None
x_error: Optional[str] = None
web_error: Optional[str] = None
youtube_error: Optional[str] = None
hackernews_error: Optional[str] = None
polymarket_error: Optional[str] = None
# Handle resolution
resolved_x_handle: Optional[str] = None
# Cache info
from_cache: bool = False
cache_age_hours: Optional[float] = None
def to_dict(self) -> Dict[str, Any]:
d = {
'topic': self.topic,
'range': {
'from': self.range_from,
'to': self.range_to,
},
'generated_at': self.generated_at,
'mode': self.mode,
'openai_model_used': self.openai_model_used,
'xai_model_used': self.xai_model_used,
'reddit': [r.to_dict() for r in self.reddit],
'x': [x.to_dict() for x in self.x],
'web': [w.to_dict() for w in self.web],
'youtube': [y.to_dict() for y in self.youtube],
'hackernews': [h.to_dict() for h in self.hackernews],
'polymarket': [p.to_dict() for p in self.polymarket],
'best_practices': self.best_practices,
'prompt_pack': self.prompt_pack,
'context_snippet_md': self.context_snippet_md,
}
if self.resolved_x_handle:
d['resolved_x_handle'] = self.resolved_x_handle
if self.reddit_error:
d['reddit_error'] = self.reddit_error
if self.x_error:
d['x_error'] = self.x_error
if self.web_error:
d['web_error'] = self.web_error
if self.youtube_error:
d['youtube_error'] = self.youtube_error
if self.hackernews_error:
d['hackernews_error'] = self.hackernews_error
if self.polymarket_error:
d['polymarket_error'] = self.polymarket_error
if self.from_cache:
d['from_cache'] = self.from_cache
if self.cache_age_hours is not None:
d['cache_age_hours'] = self.cache_age_hours
return d
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "Report":
"""Create Report from serialized dict (handles cache format)."""
# Handle range field conversion
range_data = data.get('range', {})
range_from = range_data.get('from', data.get('range_from', ''))
range_to = range_data.get('to', data.get('range_to', ''))
# Reconstruct Reddit items
reddit_items = []
for r in data.get('reddit', []):
eng = None
if r.get('engagement'):
eng = Engagement(**r['engagement'])
comments = [Comment(**c) for c in r.get('top_comments', [])]
subs = SubScores(**r.get('subs', {})) if r.get('subs') else SubScores()
reddit_items.append(RedditItem(
id=r['id'],
title=r['title'],
url=r['url'],
subreddit=r['subreddit'],
date=r.get('date'),
date_confidence=r.get('date_confidence', 'low'),
engagement=eng,
top_comments=comments,
comment_insights=r.get('comment_insights', []),
relevance=r.get('relevance', 0.5),
why_relevant=r.get('why_relevant', ''),
subs=subs,
score=r.get('score', 0),
cross_refs=r.get('cross_refs', []),
))
# Reconstruct X items
x_items = []
for x in data.get('x', []):
eng = None
if x.get('engagement'):
eng = Engagement(**x['engagement'])
subs = SubScores(**x.get('subs', {})) if x.get('subs') else SubScores()
x_items.append(XItem(
id=x['id'],
text=x['text'],
url=x['url'],
author_handle=x['author_handle'],
date=x.get('date'),
date_confidence=x.get('date_confidence', 'low'),
engagement=eng,
relevance=x.get('relevance', 0.5),
why_relevant=x.get('why_relevant', ''),
subs=subs,
score=x.get('score', 0),
cross_refs=x.get('cross_refs', []),
))
# Reconstruct Web items
web_items = []
for w in data.get('web', []):
subs = SubScores(**w.get('subs', {})) if w.get('subs') else SubScores()
web_items.append(WebSearchItem(
id=w['id'],
title=w['title'],
url=w['url'],
source_domain=w.get('source_domain', ''),
snippet=w.get('snippet', ''),
date=w.get('date'),
date_confidence=w.get('date_confidence', 'low'),
relevance=w.get('relevance', 0.5),
why_relevant=w.get('why_relevant', ''),
subs=subs,
score=w.get('score', 0),
cross_refs=w.get('cross_refs', []),
))
# Reconstruct YouTube items
youtube_items = []
for y in data.get('youtube', []):
eng = None
if y.get('engagement'):
eng = Engagement(**y['engagement'])
subs = SubScores(**y.get('subs', {})) if y.get('subs') else SubScores()
youtube_items.append(YouTubeItem(
id=y['id'],
title=y['title'],
url=y['url'],
channel_name=y.get('channel_name', ''),
date=y.get('date'),
date_confidence=y.get('date_confidence', 'high'),
engagement=eng,
transcript_snippet=y.get('transcript_snippet', ''),
relevance=y.get('relevance', 0.7),
why_relevant=y.get('why_relevant', ''),
subs=subs,
score=y.get('score', 0),
cross_refs=y.get('cross_refs', []),
))
# Reconstruct HackerNews items
hn_items = []
for h in data.get('hackernews', []):
eng = None
if h.get('engagement'):
eng = Engagement(**h['engagement'])
comments = [Comment(**c) for c in h.get('top_comments', [])]
subs = SubScores(**h.get('subs', {})) if h.get('subs') else SubScores()
hn_items.append(HackerNewsItem(
id=h['id'],
title=h['title'],
url=h.get('url', ''),
hn_url=h.get('hn_url', ''),
author=h.get('author', ''),
date=h.get('date'),
date_confidence=h.get('date_confidence', 'high'),
engagement=eng,
top_comments=comments,
comment_insights=h.get('comment_insights', []),
relevance=h.get('relevance', 0.5),
why_relevant=h.get('why_relevant', ''),
subs=subs,
score=h.get('score', 0),
cross_refs=h.get('cross_refs', []),
))
# Reconstruct Polymarket items (backward compat: key may not exist)
pm_items = []
for p in data.get('polymarket', []):
eng = None
if p.get('engagement'):
eng = Engagement(**p['engagement'])
subs = SubScores(**p.get('subs', {})) if p.get('subs') else SubScores()
pm_items.append(PolymarketItem(
id=p['id'],
title=p['title'],
question=p.get('question', ''),
url=p['url'],
outcome_prices=p.get('outcome_prices', []),
outcomes_remaining=p.get('outcomes_remaining', 0),
price_movement=p.get('price_movement'),
date=p.get('date'),
date_confidence=p.get('date_confidence', 'high'),
engagement=eng,
end_date=p.get('end_date'),
relevance=p.get('relevance', 0.5),
why_relevant=p.get('why_relevant', ''),
subs=subs,
score=p.get('score', 0),
cross_refs=p.get('cross_refs', []),
))
return cls(
topic=data['topic'],
range_from=range_from,
range_to=range_to,
generated_at=data['generated_at'],
mode=data['mode'],
openai_model_used=data.get('openai_model_used'),
xai_model_used=data.get('xai_model_used'),
reddit=reddit_items,
x=x_items,
web=web_items,
youtube=youtube_items,
hackernews=hn_items,
polymarket=pm_items,
best_practices=data.get('best_practices', []),
prompt_pack=data.get('prompt_pack', []),
context_snippet_md=data.get('context_snippet_md', ''),
reddit_error=data.get('reddit_error'),
x_error=data.get('x_error'),
web_error=data.get('web_error'),
youtube_error=data.get('youtube_error'),
hackernews_error=data.get('hackernews_error'),
polymarket_error=data.get('polymarket_error'),
resolved_x_handle=data.get('resolved_x_handle'),
from_cache=data.get('from_cache', False),
cache_age_hours=data.get('cache_age_hours'),
)
def create_report(
topic: str,
from_date: str,
to_date: str,
mode: str,
openai_model: Optional[str] = None,
xai_model: Optional[str] = None,
) -> Report:
"""Create a new report with metadata."""
return Report(
topic=topic,
range_from=from_date,
range_to=to_date,
generated_at=datetime.now(timezone.utc).isoformat(),
mode=mode,
openai_model_used=openai_model,
xai_model_used=xai_model,
)

View File

@@ -0,0 +1,492 @@
"""Popularity-aware scoring for last30days skill."""
import math
from typing import List, Optional, Union
from . import dates, schema
# Score weights for Reddit/X (has engagement)
WEIGHT_RELEVANCE = 0.45
WEIGHT_RECENCY = 0.25
WEIGHT_ENGAGEMENT = 0.30
# WebSearch weights (no engagement, reweighted to 100%)
WEBSEARCH_WEIGHT_RELEVANCE = 0.55
WEBSEARCH_WEIGHT_RECENCY = 0.45
WEBSEARCH_SOURCE_PENALTY = 15 # Points deducted for lacking engagement
# WebSearch date confidence adjustments
WEBSEARCH_VERIFIED_BONUS = 10 # Bonus for URL-verified recent date (high confidence)
WEBSEARCH_NO_DATE_PENALTY = 20 # Heavy penalty for no date signals (low confidence)
# Default engagement score for unknown
DEFAULT_ENGAGEMENT = 35
UNKNOWN_ENGAGEMENT_PENALTY = 3
def log1p_safe(x: Optional[int]) -> float:
"""Safe log1p that handles None and negative values."""
if x is None or x < 0:
return 0.0
return math.log1p(x)
def compute_reddit_engagement_raw(engagement: Optional[schema.Engagement]) -> Optional[float]:
"""Compute raw engagement score for Reddit item.
Formula: 0.55*log1p(score) + 0.40*log1p(num_comments) + 0.05*(upvote_ratio*10)
"""
if engagement is None:
return None
if engagement.score is None and engagement.num_comments is None:
return None
score = log1p_safe(engagement.score)
comments = log1p_safe(engagement.num_comments)
ratio = (engagement.upvote_ratio or 0.5) * 10
return 0.55 * score + 0.40 * comments + 0.05 * ratio
def compute_x_engagement_raw(engagement: Optional[schema.Engagement]) -> Optional[float]:
"""Compute raw engagement score for X item.
Formula: 0.55*log1p(likes) + 0.25*log1p(reposts) + 0.15*log1p(replies) + 0.05*log1p(quotes)
"""
if engagement is None:
return None
if engagement.likes is None and engagement.reposts is None:
return None
likes = log1p_safe(engagement.likes)
reposts = log1p_safe(engagement.reposts)
replies = log1p_safe(engagement.replies)
quotes = log1p_safe(engagement.quotes)
return 0.55 * likes + 0.25 * reposts + 0.15 * replies + 0.05 * quotes
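Both raw-engagement formulas are log-weighted blends; the point of `log1p` is that a 10x-viral post adds a constant, not 10x, to the score. A standalone sketch (the helper name and sample numbers are illustrative, not the module's):

```python
import math

def reddit_engagement_raw(score, num_comments, upvote_ratio=None):
    """Hypothetical standalone mirror of the Reddit formula above."""
    if score is None and num_comments is None:
        return None
    s = math.log1p(max(score or 0, 0))
    c = math.log1p(max(num_comments or 0, 0))
    r = (upvote_ratio if upvote_ratio is not None else 0.5) * 10
    return 0.55 * s + 0.40 * c + 0.05 * r

# log1p keeps viral posts from drowning everything else:
low = reddit_engagement_raw(100, 20, 0.9)     # ≈ 4.21
high = reddit_engagement_raw(1000, 200, 0.9)  # ≈ 6.37
```

Ten times the upvotes and comments adds roughly 2 points, which is why the later min-max normalization still has a usable spread.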
def normalize_to_100(values: List[Optional[float]], default: float = 50) -> List[Optional[float]]:
"""Normalize a list of values to a 0-100 scale.
Args:
values: Raw values (None entries are preserved so callers can substitute their own fallback, e.g. DEFAULT_ENGAGEMENT)
default: Value returned for every entry when no valid values exist at all
Returns:
Normalized values; None entries stay None whenever any valid value exists
"""
# Filter out None
valid = [v for v in values if v is not None]
if not valid:
return [default for _ in values]
min_val = min(valid)
max_val = max(valid)
range_val = max_val - min_val
if range_val == 0:
# All valid values identical: map them to the midpoint, keep None as None
return [None if v is None else 50 for v in values]
result = []
for v in values:
if v is None:
result.append(None)
else:
normalized = ((v - min_val) / range_val) * 100
result.append(normalized)
return result
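The min-max normalization above can be exercised standalone; this hypothetical mini-helper mirrors the happy path (distinct valid values, None passed through):

```python
def normalize(values):
    """Illustrative min-max normalizer; None entries pass through."""
    valid = [v for v in values if v is not None]
    if not valid:
        return values
    lo, hi = min(valid), max(valid)
    if hi == lo:
        return [None if v is None else 50.0 for v in values]
    return [None if v is None else (v - lo) / (hi - lo) * 100 for v in values]

print(normalize([2.0, None, 6.0, 4.0]))  # → [0.0, None, 100.0, 50.0]
```

The min lands at 0, the max at 100, and everything else linearly in between, which is what lets the 45/25/30 weighted blend treat engagement on the same footing as relevance and recency.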
def score_reddit_items(items: List[schema.RedditItem]) -> List[schema.RedditItem]:
"""Compute scores for Reddit items.
Args:
items: List of Reddit items
Returns:
Items with updated scores
"""
if not items:
return items
# Compute raw engagement scores
eng_raw = [compute_reddit_engagement_raw(item.engagement) for item in items]
# Normalize engagement to 0-100
eng_normalized = normalize_to_100(eng_raw)
for i, item in enumerate(items):
# Relevance subscore (model-provided, convert to 0-100)
rel_score = int(item.relevance * 100)
# Recency subscore
rec_score = dates.recency_score(item.date)
# Engagement subscore
if eng_normalized[i] is not None:
eng_score = int(eng_normalized[i])
else:
eng_score = DEFAULT_ENGAGEMENT
# Store subscores
item.subs = schema.SubScores(
relevance=rel_score,
recency=rec_score,
engagement=eng_score,
)
# Compute overall score
overall = (
WEIGHT_RELEVANCE * rel_score +
WEIGHT_RECENCY * rec_score +
WEIGHT_ENGAGEMENT * eng_score
)
# Apply penalty for unknown engagement
if eng_raw[i] is None:
overall -= UNKNOWN_ENGAGEMENT_PENALTY
# Apply penalty for low date confidence
if item.date_confidence == "low":
overall -= 5
elif item.date_confidence == "med":
overall -= 2
item.score = max(0, min(100, int(overall)))
return items
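Every `score_*` function shares the same arithmetic core: a 45/25/30 weighted blend, small penalties, then a clamp to 0-100. A minimal worked example (the `blend` name is illustrative; the weights and penalties are the constants above):

```python
def blend(rel, rec, eng, eng_unknown=False, date_confidence="high"):
    # WEIGHT_RELEVANCE / WEIGHT_RECENCY / WEIGHT_ENGAGEMENT from above
    overall = 0.45 * rel + 0.25 * rec + 0.30 * eng
    if eng_unknown:
        overall -= 3   # UNKNOWN_ENGAGEMENT_PENALTY
    if date_confidence == "low":
        overall -= 5
    elif date_confidence == "med":
        overall -= 2
    return max(0, min(100, int(overall)))

print(blend(80, 90, 60))                         # → 76  (36 + 22.5 + 18, truncated)
print(blend(80, 90, 60, date_confidence="low"))  # → 71
```

Relevance dominates, so a marginally relevant item can't ride recency and engagement alone to the top.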
def score_x_items(items: List[schema.XItem]) -> List[schema.XItem]:
"""Compute scores for X items.
Args:
items: List of X items
Returns:
Items with updated scores
"""
if not items:
return items
# Compute raw engagement scores
eng_raw = [compute_x_engagement_raw(item.engagement) for item in items]
# Normalize engagement to 0-100
eng_normalized = normalize_to_100(eng_raw)
for i, item in enumerate(items):
# Relevance subscore (model-provided, convert to 0-100)
rel_score = int(item.relevance * 100)
# Recency subscore
rec_score = dates.recency_score(item.date)
# Engagement subscore
if eng_normalized[i] is not None:
eng_score = int(eng_normalized[i])
else:
eng_score = DEFAULT_ENGAGEMENT
# Store subscores
item.subs = schema.SubScores(
relevance=rel_score,
recency=rec_score,
engagement=eng_score,
)
# Compute overall score
overall = (
WEIGHT_RELEVANCE * rel_score +
WEIGHT_RECENCY * rec_score +
WEIGHT_ENGAGEMENT * eng_score
)
# Apply penalty for unknown engagement
if eng_raw[i] is None:
overall -= UNKNOWN_ENGAGEMENT_PENALTY
# Apply penalty for low date confidence
if item.date_confidence == "low":
overall -= 5
elif item.date_confidence == "med":
overall -= 2
item.score = max(0, min(100, int(overall)))
return items
def compute_youtube_engagement_raw(engagement: Optional[schema.Engagement]) -> Optional[float]:
"""Compute raw engagement score for YouTube item.
Formula: 0.50*log1p(views) + 0.35*log1p(likes) + 0.15*log1p(comments)
Views dominate on YouTube — they're the primary discovery signal.
"""
if engagement is None:
return None
if engagement.views is None and engagement.likes is None:
return None
views = log1p_safe(engagement.views)
likes = log1p_safe(engagement.likes)
comments = log1p_safe(engagement.num_comments)
return 0.50 * views + 0.35 * likes + 0.15 * comments
def score_youtube_items(items: List[schema.YouTubeItem]) -> List[schema.YouTubeItem]:
"""Compute scores for YouTube items.
Uses same weight structure as Reddit/X (relevance + recency + engagement).
"""
if not items:
return items
eng_raw = [compute_youtube_engagement_raw(item.engagement) for item in items]
eng_normalized = normalize_to_100(eng_raw)
for i, item in enumerate(items):
rel_score = int(item.relevance * 100)
rec_score = dates.recency_score(item.date)
if eng_normalized[i] is not None:
eng_score = int(eng_normalized[i])
else:
eng_score = DEFAULT_ENGAGEMENT
item.subs = schema.SubScores(
relevance=rel_score,
recency=rec_score,
engagement=eng_score,
)
overall = (
WEIGHT_RELEVANCE * rel_score +
WEIGHT_RECENCY * rec_score +
WEIGHT_ENGAGEMENT * eng_score
)
if eng_raw[i] is None:
overall -= UNKNOWN_ENGAGEMENT_PENALTY
item.score = max(0, min(100, int(overall)))
return items
def compute_hackernews_engagement_raw(engagement: Optional[schema.Engagement]) -> Optional[float]:
"""Compute raw engagement score for Hacker News item.
Formula: 0.55*log1p(points) + 0.45*log1p(num_comments)
Points are the primary signal on HN; comments indicate depth of discussion.
"""
if engagement is None:
return None
if engagement.score is None and engagement.num_comments is None:
return None
points = log1p_safe(engagement.score)
comments = log1p_safe(engagement.num_comments)
return 0.55 * points + 0.45 * comments
def score_hackernews_items(items: List[schema.HackerNewsItem]) -> List[schema.HackerNewsItem]:
"""Compute scores for Hacker News items.
Uses same weight structure as Reddit/X (relevance + recency + engagement).
"""
if not items:
return items
eng_raw = [compute_hackernews_engagement_raw(item.engagement) for item in items]
eng_normalized = normalize_to_100(eng_raw)
for i, item in enumerate(items):
rel_score = int(item.relevance * 100)
rec_score = dates.recency_score(item.date)
if eng_normalized[i] is not None:
eng_score = int(eng_normalized[i])
else:
eng_score = DEFAULT_ENGAGEMENT
item.subs = schema.SubScores(
relevance=rel_score,
recency=rec_score,
engagement=eng_score,
)
overall = (
WEIGHT_RELEVANCE * rel_score +
WEIGHT_RECENCY * rec_score +
WEIGHT_ENGAGEMENT * eng_score
)
if eng_raw[i] is None:
overall -= UNKNOWN_ENGAGEMENT_PENALTY
item.score = max(0, min(100, int(overall)))
return items
def compute_polymarket_engagement_raw(engagement: Optional[schema.Engagement]) -> Optional[float]:
"""Compute raw engagement score for Polymarket item.
Formula: 0.60*log1p(volume) + 0.40*log1p(liquidity)
Volume is the primary signal (money flowing); liquidity indicates market depth.
"""
if engagement is None:
return None
if engagement.volume is None and engagement.liquidity is None:
return None
volume = math.log1p(engagement.volume or 0)
liquidity = math.log1p(engagement.liquidity or 0)
return 0.60 * volume + 0.40 * liquidity
def score_polymarket_items(items: List[schema.PolymarketItem]) -> List[schema.PolymarketItem]:
"""Compute scores for Polymarket items.
Uses same weight structure as Reddit/X (relevance + recency + engagement).
"""
if not items:
return items
eng_raw = [compute_polymarket_engagement_raw(item.engagement) for item in items]
eng_normalized = normalize_to_100(eng_raw)
for i, item in enumerate(items):
rel_score = int(item.relevance * 100)
rec_score = dates.recency_score(item.date)
if eng_normalized[i] is not None:
eng_score = int(eng_normalized[i])
else:
eng_score = DEFAULT_ENGAGEMENT
item.subs = schema.SubScores(
relevance=rel_score,
recency=rec_score,
engagement=eng_score,
)
overall = (
WEIGHT_RELEVANCE * rel_score +
WEIGHT_RECENCY * rec_score +
WEIGHT_ENGAGEMENT * eng_score
)
if eng_raw[i] is None:
overall -= UNKNOWN_ENGAGEMENT_PENALTY
item.score = max(0, min(100, int(overall)))
return items
def score_websearch_items(items: List[schema.WebSearchItem]) -> List[schema.WebSearchItem]:
"""Compute scores for WebSearch items WITHOUT engagement metrics.
Uses reweighted formula: 55% relevance + 45% recency - 15pt source penalty.
This ensures WebSearch items rank below comparable Reddit/X items.
Date confidence adjustments:
- High confidence (URL-verified date): +10 bonus
- Med confidence (snippet-extracted date): no change
- Low confidence (no date signals): -20 penalty
Args:
items: List of WebSearch items
Returns:
Items with updated scores
"""
if not items:
return items
for item in items:
# Relevance subscore (model-provided, convert to 0-100)
rel_score = int(item.relevance * 100)
# Recency subscore
rec_score = dates.recency_score(item.date)
# Store subscores (engagement is 0 for WebSearch - no data)
item.subs = schema.SubScores(
relevance=rel_score,
recency=rec_score,
engagement=0, # Explicitly zero - no engagement data available
)
# Compute overall score using WebSearch weights
overall = (
WEBSEARCH_WEIGHT_RELEVANCE * rel_score +
WEBSEARCH_WEIGHT_RECENCY * rec_score
)
# Apply source penalty (WebSearch < Reddit/X for same relevance/recency)
overall -= WEBSEARCH_SOURCE_PENALTY
# Apply date confidence adjustments
# High confidence (URL-verified): reward with bonus
# Med confidence (snippet-extracted): neutral
# Low confidence (no date signals): heavy penalty
if item.date_confidence == "high":
overall += WEBSEARCH_VERIFIED_BONUS # Reward verified recent dates
elif item.date_confidence == "low":
overall -= WEBSEARCH_NO_DATE_PENALTY # Heavy penalty for unknown
item.score = max(0, min(100, int(overall)))
return items
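With no engagement term, the WebSearch weights reshuffle to 55/45 and the confidence adjustments do the rest of the work. A sketch of that arithmetic (function name is illustrative; constants are the ones defined above):

```python
def websearch_score(rel, rec, date_confidence):
    overall = 0.55 * rel + 0.45 * rec   # reweighted: no engagement term
    overall -= 15                       # source penalty vs. Reddit/X
    if date_confidence == "high":
        overall += 10
    elif date_confidence == "low":
        overall -= 20
    return max(0, min(100, int(overall)))

# Same article, three date signals: verified beats undated by 30 points
print(websearch_score(80, 90, "high"))  # → 79
print(websearch_score(80, 90, "med"))   # → 69
print(websearch_score(80, 90, "low"))   # → 49
```

The 30-point spread between verified and undated dates is deliberate: in a last-30-days report, an article you can't date is nearly worthless.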
def sort_items(items: List[Union[schema.RedditItem, schema.XItem, schema.WebSearchItem, schema.YouTubeItem, schema.HackerNewsItem, schema.PolymarketItem]]) -> List:
"""Sort items by score (descending), then date, then source priority.
Args:
items: List of items to sort
Returns:
Sorted items
"""
def sort_key(item):
# Primary: score descending (negate for descending)
score = -item.score
# Secondary: date descending (recent first)
date = item.date or "0000-00-00"
date_key = -int(date.replace("-", ""))
# Tertiary: source priority (Reddit > X > YouTube > HN > Polymarket > WebSearch)
if isinstance(item, schema.RedditItem):
source_priority = 0
elif isinstance(item, schema.XItem):
source_priority = 1
elif isinstance(item, schema.YouTubeItem):
source_priority = 2
elif isinstance(item, schema.HackerNewsItem):
source_priority = 3
elif isinstance(item, schema.PolymarketItem):
source_priority = 4
else: # WebSearchItem
source_priority = 5
# Quaternary: title/text for stability
text = getattr(item, "title", "") or getattr(item, "text", "")
return (score, date_key, source_priority, text)
return sorted(items, key=sort_key)
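The composite key above can be exercised standalone; the dicts below stand in for the item dataclasses, with `source_priority` precomputed the way the `isinstance` chain does:

```python
def sort_key(item):
    date = item["date"] or "0000-00-00"
    return (-item["score"],               # score descending
            -int(date.replace("-", "")),  # then newest first
            item["source_priority"],      # then Reddit < X < ... < Web
            item["text"])                 # then alphabetical, for stability

items = [
    {"score": 70, "date": "2026-02-10", "source_priority": 1, "text": "x post"},
    {"score": 70, "date": "2026-02-20", "source_priority": 5, "text": "web hit"},
    {"score": 90, "date": None, "source_priority": 0, "text": "reddit thread"},
]
ranked = [i["text"] for i in sorted(items, key=sort_key)]
print(ranked)  # → ['reddit thread', 'web hit', 'x post']
```

Note the tie-break order: a dated web hit outranks an older same-score X post, because date is compared before source priority.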

View File

@@ -0,0 +1,137 @@
"""Tavily web search backend for last30days skill.
Uses Tavily Search API to find recent web content (blogs, docs, news, tutorials).
"""
import re
import sys
from datetime import datetime
from typing import Any, Dict, List, Optional
from urllib.parse import urlparse
from . import http
ENDPOINT = "https://api.tavily.com/search"
# Domains to exclude (handled by Reddit/X search)
EXCLUDED_DOMAINS = {
"reddit.com", "www.reddit.com", "old.reddit.com",
"twitter.com", "www.twitter.com", "x.com", "www.x.com",
}
def search_web(
topic: str,
from_date: str,
to_date: str,
api_key: str,
depth: str = "default",
) -> List[Dict[str, Any]]:
"""Search the web via Tavily API."""
max_results = {"quick": 8, "default": 15, "deep": 25}.get(depth, 15)
search_depth = "basic" if depth == "quick" else "advanced"
payload = {
"api_key": api_key,
"query": (
f"{topic}. Focus on content published between {from_date} and {to_date}. "
f"Exclude reddit.com, x.com, and twitter.com."
),
"search_depth": search_depth,
"max_results": max_results,
"include_answer": False,
"include_raw_content": False,
"include_images": False,
}
sys.stderr.write(f"[Web] Searching Tavily for: {topic}\n")
sys.stderr.flush()
response = http.post(ENDPOINT, json_data=payload, timeout=30)
return _normalize_results(response)
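The depth knob maps onto two Tavily request parameters, as in the payload above; a standalone sketch of just that mapping (helper name is illustrative):

```python
def tavily_params(depth):
    """Map the depth knob onto Tavily request parameters (as above)."""
    return {
        "max_results": {"quick": 8, "default": 15, "deep": 25}.get(depth, 15),
        "search_depth": "basic" if depth == "quick" else "advanced",
    }

print(tavily_params("quick"))  # → {'max_results': 8, 'search_depth': 'basic'}
```

Unknown depth values fall back to the default tier rather than erroring, so a typo in a CLI flag degrades gracefully.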
def _normalize_results(response: Dict[str, Any]) -> List[Dict[str, Any]]:
"""Convert Tavily response to websearch item schema."""
items: List[Dict[str, Any]] = []
results = response.get("results", [])
if not isinstance(results, list):
return items
for i, result in enumerate(results):
if not isinstance(result, dict):
continue
url = str(result.get("url", "")).strip()
if not url:
continue
domain = _extract_domain(url)
if not domain:
continue
if domain in EXCLUDED_DOMAINS:
continue
title = str(result.get("title", "")).strip()
snippet = str(result.get("content", result.get("snippet", ""))).strip()
if not title and not snippet:
continue
raw_date = result.get("published_date") or result.get("date")
date = _parse_date(raw_date)
date_confidence = "med" if date else "low"
score = result.get("score", result.get("relevance_score", 0.6))
try:
relevance = min(1.0, max(0.0, float(score)))
except (TypeError, ValueError):
relevance = 0.6
items.append({
"id": f"W{i+1}",
"title": title[:200],
"url": url,
"source_domain": domain,
"snippet": snippet[:500],
"date": date,
"date_confidence": date_confidence,
"relevance": relevance,
"why_relevant": "",
})
sys.stderr.write(f"[Web] Tavily: {len(items)} results\n")
sys.stderr.flush()
return items
def _extract_domain(url: str) -> str:
try:
domain = urlparse(url).netloc.lower()
if domain.startswith("www."):
domain = domain[4:]
return domain
except Exception:
return ""
def _parse_date(value: Any) -> Optional[str]:
"""Parse date to YYYY-MM-DD when possible."""
if not value:
return None
text = str(value).strip()
if not text:
return None
# ISO-like formats: 2026-03-03 or 2026-03-03T12:34:56Z
iso = re.search(r"(\d{4}-\d{2}-\d{2})", text)
if iso:
return iso.group(1)
# RFC2822-ish format fallback
for fmt in ("%a, %d %b %Y %H:%M:%S %Z", "%d %b %Y", "%b %d, %Y"):
try:
return datetime.strptime(text, fmt).strftime("%Y-%m-%d")
except ValueError:
continue
return None
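The ISO-first, strptime-fallback strategy above can be sketched standalone (using a subset of the fallback formats, and assuming an English locale for `%b`):

```python
import re
from datetime import datetime

def parse_date(text):
    """ISO regex first, then strptime fallbacks (illustrative subset)."""
    m = re.search(r"(\d{4}-\d{2}-\d{2})", text)
    if m:
        return m.group(1)
    for fmt in ("%d %b %Y", "%b %d, %Y"):
        try:
            return datetime.strptime(text, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None

print(parse_date("2026-03-03T12:34:56Z"))  # → 2026-03-03
print(parse_date("Mar 3, 2026"))           # → 2026-03-03
print(parse_date("sometime recently"))     # → None
```

Trying the regex first is the cheap win: it handles bare dates and full ISO timestamps in one pass, leaving strptime for the long tail of human-formatted dates.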

View File

@@ -0,0 +1,495 @@
"""Terminal UI utilities for last30days skill."""
import sys
import time
import threading
import random
from typing import Optional
# Check if we're in a real terminal (not captured by Claude Code)
IS_TTY = sys.stderr.isatty()
# ANSI color codes
class Colors:
PURPLE = '\033[95m'
BLUE = '\033[94m'
CYAN = '\033[96m'
GREEN = '\033[92m'
YELLOW = '\033[93m'
RED = '\033[91m'
BOLD = '\033[1m'
DIM = '\033[2m'
RESET = '\033[0m'
BANNER = f"""{Colors.PURPLE}{Colors.BOLD}
██╗ █████╗ ███████╗████████╗██████╗ ██████╗ ██████╗ █████╗ ██╗ ██╗███████╗
██║ ██╔══██╗██╔════╝╚══██╔══╝╚════██╗██╔═████╗██╔══██╗██╔══██╗╚██╗ ██╔╝██╔════╝
██║ ███████║███████╗ ██║ █████╔╝██║██╔██║██║ ██║███████║ ╚████╔╝ ███████╗
██║ ██╔══██║╚════██║ ██║ ╚═══██╗████╔╝██║██║ ██║██╔══██║ ╚██╔╝ ╚════██║
███████╗██║ ██║███████║ ██║ ██████╔╝╚██████╔╝██████╔╝██║ ██║ ██║ ███████║
╚══════╝╚═╝ ╚═╝╚══════╝ ╚═╝ ╚═════╝ ╚═════╝ ╚═════╝ ╚═╝ ╚═╝ ╚═╝ ╚══════╝
{Colors.RESET}{Colors.DIM} 30 days of research. 30 seconds of work.{Colors.RESET}
"""
MINI_BANNER = f"""{Colors.PURPLE}{Colors.BOLD}/last30days{Colors.RESET} {Colors.DIM}· researching...{Colors.RESET}"""
# Fun status messages for each phase
REDDIT_MESSAGES = [
"Diving into Reddit threads...",
"Scanning subreddits for gold...",
"Reading what Redditors are saying...",
"Exploring the front page of the internet...",
"Finding the good discussions...",
"Upvoting mentally...",
"Scrolling through comments...",
]
X_MESSAGES = [
"Checking what X is buzzing about...",
"Reading the timeline...",
"Finding the hot takes...",
"Scanning tweets and threads...",
"Discovering trending insights...",
"Following the conversation...",
"Reading between the posts...",
]
ENRICHING_MESSAGES = [
"Getting the juicy details...",
"Fetching engagement metrics...",
"Reading top comments...",
"Extracting insights...",
"Analyzing discussions...",
]
YOUTUBE_MESSAGES = [
"Searching YouTube for videos...",
"Finding relevant video content...",
"Scanning YouTube channels...",
"Discovering video discussions...",
"Fetching transcripts...",
]
HN_MESSAGES = [
"Searching Hacker News...",
"Scanning HN front page stories...",
"Finding technical discussions...",
"Discovering developer conversations...",
]
POLYMARKET_MESSAGES = [
"Checking prediction markets...",
"Finding what people are betting on...",
"Scanning Polymarket for odds...",
"Discovering prediction markets...",
]
PROCESSING_MESSAGES = [
"Crunching the data...",
"Scoring and ranking...",
"Finding patterns...",
"Removing duplicates...",
"Organizing findings...",
]
WEB_ONLY_MESSAGES = [
"Searching the web...",
"Finding blogs and docs...",
"Crawling news sites...",
"Discovering tutorials...",
]
def _build_nux_message(diag: dict = None) -> str:
"""Build conversational NUX message with dynamic source status."""
if diag:
reddit = "" if diag.get("openai") else ""
x = "" if diag.get("x_source") else ""
youtube = "" if diag.get("youtube") else ""
web = "" if diag.get("web_search_backend") else ""
status_line = f"Reddit {reddit}, X {x}, YouTube {youtube}, Web {web}"
else:
status_line = "YouTube ✓, Web ✓, Reddit ✗, X ✗"
return f"""
I just researched that for you. Here's what I've got right now:
{status_line}
You can unlock more sources with API keys or by signing in to Codex — just ask me how and I'll walk you through it. More sources means better research, but it works fine as-is.
Some examples of what you can do:
- "last30 what are people saying about Figma"
- "last30 watch my biggest competitor every week"
- "last30 watch Peter Steinberger every 30 days"
- "last30 watch AI video tools monthly"
- "last30 what have you found about AI video?"
Just start with "last30" and talk to me like normal.
"""
# Shorter promo for single missing key
PROMO_SINGLE_KEY = {
"reddit": "\n💡 You can unlock Reddit with an OpenAI API key or by running `codex login` — just ask me how.\n",
"x": "\n💡 You can unlock X with an xAI API key — just ask me how.\n",
}
# Bird auth help (for local users with vendored Bird CLI)
BIRD_AUTH_HELP = f"""
{Colors.YELLOW}Bird authentication failed.{Colors.RESET}
To fix this:
1. Log into X (twitter.com) in Safari, Chrome, or Firefox
2. Try again — Bird reads your browser cookies automatically.
"""
BIRD_AUTH_HELP_PLAIN = """
Bird authentication failed.
To fix this:
1. Log into X (twitter.com) in Safari, Chrome, or Firefox
2. Try again — Bird reads your browser cookies automatically.
"""
# Spinner frames
SPINNER_FRAMES = ['⠋', '⠙', '⠹', '⠸', '⠼', '⠴', '⠦', '⠧', '⠇', '⠏']
DOTS_FRAMES = [' ', '. ', '.. ', '...']
class Spinner:
"""Animated spinner for long-running operations."""
def __init__(self, message: str = "Working", color: str = Colors.CYAN, quiet: bool = False):
self.message = message
self.color = color
self.running = False
self.thread: Optional[threading.Thread] = None
self.frame_idx = 0
self.shown_static = False
self.quiet = quiet # Suppress non-TTY start message (still shows ✓ completion)
def _spin(self):
while self.running:
frame = SPINNER_FRAMES[self.frame_idx % len(SPINNER_FRAMES)]
sys.stderr.write(f"\r{self.color}{frame}{Colors.RESET} {self.message} ")
sys.stderr.flush()
self.frame_idx += 1
time.sleep(0.08)
def start(self):
self.running = True
if IS_TTY:
# Real terminal - animate
self.thread = threading.Thread(target=self._spin, daemon=True)
self.thread.start()
else:
# Not a TTY (Claude Code) - just print once
if not self.shown_static and not self.quiet:
sys.stderr.write(f"{self.message}\n")
sys.stderr.flush()
self.shown_static = True
def update(self, message: str):
self.message = message
if not IS_TTY and not self.shown_static:
# Print update in non-TTY mode
sys.stderr.write(f"{message}\n")
sys.stderr.flush()
def stop(self, final_message: str = ""):
self.running = False
if self.thread:
self.thread.join(timeout=0.2)
if IS_TTY:
# Clear the line in real terminal
sys.stderr.write("\r" + " " * 80 + "\r")
if final_message:
sys.stderr.write(f"{final_message}\n")
sys.stderr.flush()
class ProgressDisplay:
"""Progress display for research phases."""
def __init__(self, topic: str, show_banner: bool = True):
self.topic = topic
self.spinner: Optional[Spinner] = None
self.start_time = time.time()
if show_banner:
self._show_banner()
def _show_banner(self):
if IS_TTY:
sys.stderr.write(MINI_BANNER + "\n")
sys.stderr.write(f"{Colors.DIM}Topic: {Colors.RESET}{Colors.BOLD}{self.topic}{Colors.RESET}\n\n")
else:
# Simple text for non-TTY
sys.stderr.write(f"/last30days · researching: {self.topic}\n")
sys.stderr.flush()
def start_reddit(self):
msg = random.choice(REDDIT_MESSAGES)
self.spinner = Spinner(f"{Colors.YELLOW}Reddit{Colors.RESET} {msg}", Colors.YELLOW)
self.spinner.start()
def end_reddit(self, count: int):
if self.spinner:
self.spinner.stop(f"{Colors.YELLOW}Reddit{Colors.RESET} Found {count} threads")
def start_reddit_enrich(self, current: int, total: int):
if self.spinner:
self.spinner.stop()
msg = random.choice(ENRICHING_MESSAGES)
self.spinner = Spinner(f"{Colors.YELLOW}Reddit{Colors.RESET} [{current}/{total}] {msg}", Colors.YELLOW)
self.spinner.start()
def update_reddit_enrich(self, current: int, total: int):
if self.spinner:
msg = random.choice(ENRICHING_MESSAGES)
self.spinner.update(f"{Colors.YELLOW}Reddit{Colors.RESET} [{current}/{total}] {msg}")
def end_reddit_enrich(self):
if self.spinner:
self.spinner.stop(f"{Colors.YELLOW}Reddit{Colors.RESET} Enriched with engagement data")
def start_x(self):
msg = random.choice(X_MESSAGES)
self.spinner = Spinner(f"{Colors.CYAN}X{Colors.RESET} {msg}", Colors.CYAN)
self.spinner.start()
def end_x(self, count: int):
if self.spinner:
self.spinner.stop(f"{Colors.CYAN}X{Colors.RESET} Found {count} posts")
def start_youtube(self):
msg = random.choice(YOUTUBE_MESSAGES)
self.spinner = Spinner(f"{Colors.RED}YouTube{Colors.RESET} {msg}", Colors.RED, quiet=True)
self.spinner.start()
def end_youtube(self, count: int):
if self.spinner:
self.spinner.stop(f"{Colors.RED}YouTube{Colors.RESET} Found {count} videos")
def start_hackernews(self):
msg = random.choice(HN_MESSAGES)
self.spinner = Spinner(f"{Colors.YELLOW}HN{Colors.RESET} {msg}", Colors.YELLOW, quiet=True)
self.spinner.start()
def end_hackernews(self, count: int):
if self.spinner:
self.spinner.stop(f"{Colors.YELLOW}HN{Colors.RESET} Found {count} stories")
def start_polymarket(self):
msg = random.choice(POLYMARKET_MESSAGES)
self.spinner = Spinner(f"{Colors.GREEN}Polymarket{Colors.RESET} {msg}", Colors.GREEN, quiet=True)
self.spinner.start()
def end_polymarket(self, count: int):
if self.spinner:
self.spinner.stop(f"{Colors.GREEN}Polymarket{Colors.RESET} Found {count} markets")
def start_processing(self):
msg = random.choice(PROCESSING_MESSAGES)
self.spinner = Spinner(f"{Colors.PURPLE}Processing{Colors.RESET} {msg}", Colors.PURPLE)
self.spinner.start()
def end_processing(self):
if self.spinner:
self.spinner.stop()
def show_complete(self, reddit_count: int, x_count: int, youtube_count: int = 0, hn_count: int = 0, pm_count: int = 0):
elapsed = time.time() - self.start_time
if IS_TTY:
sys.stderr.write(f"\n{Colors.GREEN}{Colors.BOLD}✓ Research complete{Colors.RESET} ")
sys.stderr.write(f"{Colors.DIM}({elapsed:.1f}s){Colors.RESET}\n")
sys.stderr.write(f" {Colors.YELLOW}Reddit:{Colors.RESET} {reddit_count} threads ")
sys.stderr.write(f"{Colors.CYAN}X:{Colors.RESET} {x_count} posts")
if youtube_count:
sys.stderr.write(f" {Colors.RED}YouTube:{Colors.RESET} {youtube_count} videos")
if hn_count:
sys.stderr.write(f" {Colors.YELLOW}HN:{Colors.RESET} {hn_count} stories")
if pm_count:
sys.stderr.write(f" {Colors.GREEN}Polymarket:{Colors.RESET} {pm_count} markets")
sys.stderr.write("\n\n")
else:
parts = [f"Reddit: {reddit_count} threads", f"X: {x_count} posts"]
if youtube_count:
parts.append(f"YouTube: {youtube_count} videos")
if hn_count:
parts.append(f"HN: {hn_count} stories")
if pm_count:
parts.append(f"Polymarket: {pm_count} markets")
sys.stderr.write(f"✓ Research complete ({elapsed:.1f}s) - {', '.join(parts)}\n")
sys.stderr.flush()
def show_cached(self, age_hours: float = None):
if age_hours is not None:
age_str = f" ({age_hours:.1f}h old)"
else:
age_str = ""
sys.stderr.write(f"{Colors.GREEN}{Colors.RESET} {Colors.DIM}Using cached results{age_str} - use --refresh for fresh data{Colors.RESET}\n\n")
sys.stderr.flush()
def show_error(self, message: str):
sys.stderr.write(f"{Colors.RED}✗ Error:{Colors.RESET} {message}\n")
sys.stderr.flush()
def start_web_only(self):
"""Show web-only mode indicator."""
msg = random.choice(WEB_ONLY_MESSAGES)
self.spinner = Spinner(f"{Colors.GREEN}Web{Colors.RESET} {msg}", Colors.GREEN)
self.spinner.start()
def end_web_only(self):
"""End web-only spinner."""
if self.spinner:
self.spinner.stop(f"{Colors.GREEN}Web{Colors.RESET} assistant will search the web")
def show_web_only_complete(self):
"""Show completion for web-only mode."""
elapsed = time.time() - self.start_time
if IS_TTY:
sys.stderr.write(f"\n{Colors.GREEN}{Colors.BOLD}✓ Ready for web search{Colors.RESET} ")
sys.stderr.write(f"{Colors.DIM}({elapsed:.1f}s){Colors.RESET}\n")
sys.stderr.write(f" {Colors.GREEN}Web:{Colors.RESET} assistant will search blogs, docs & news\n\n")
else:
sys.stderr.write(f"✓ Ready for web search ({elapsed:.1f}s)\n")
sys.stderr.flush()
def show_promo(self, missing: str = "both", diag: dict = None):
"""Show NUX / promotional message for missing API keys.
Args:
missing: 'both', 'all', 'reddit', or 'x' - which keys are missing
diag: Optional diagnostics dict for dynamic source status
"""
if missing in ("both", "all"):
sys.stderr.write(_build_nux_message(diag))
elif missing in PROMO_SINGLE_KEY:
sys.stderr.write(PROMO_SINGLE_KEY[missing])
sys.stderr.flush()
def show_bird_auth_help(self):
"""Show Bird authentication help."""
if IS_TTY:
sys.stderr.write(BIRD_AUTH_HELP)
else:
sys.stderr.write(BIRD_AUTH_HELP_PLAIN)
sys.stderr.flush()
def show_diagnostic_banner(diag: dict):
"""Show pre-flight source status banner when sources are missing.
Args:
diag: Dict from env diagnostics with keys:
openai, xai, x_source, bird_installed, bird_authenticated,
bird_username, youtube, web_search_backend
"""
has_openai = diag.get("openai", False)
has_x = diag.get("x_source") is not None
has_youtube = diag.get("youtube", False)
has_web = diag.get("web_search_backend") is not None
# If everything is available, no banner needed
if has_openai and has_x and has_youtube and has_web:
return
lines = []
if IS_TTY:
lines.append(f"{Colors.DIM}┌─────────────────────────────────────────────────────┐{Colors.RESET}")
lines.append(f"{Colors.DIM}│{Colors.RESET} {Colors.BOLD}/last30days v2.5 — Source Status{Colors.RESET} {Colors.DIM}│{Colors.RESET}")
lines.append(f"{Colors.DIM}│{Colors.RESET} {Colors.DIM}│{Colors.RESET}")
# Reddit
if has_openai:
lines.append(f"{Colors.DIM}│{Colors.RESET} {Colors.GREEN}✅ Reddit{Colors.RESET} — OPENAI_API_KEY found {Colors.DIM}│{Colors.RESET}")
else:
lines.append(f"{Colors.DIM}│{Colors.RESET} {Colors.RED}❌ Reddit{Colors.RESET} — No OPENAI_API_KEY {Colors.DIM}│{Colors.RESET}")
lines.append(f"{Colors.DIM}│{Colors.RESET} └─ Add to ~/.config/last30days/.env {Colors.DIM}│{Colors.RESET}")
# X/Twitter
if has_x:
source = diag.get("x_source", "")
username = diag.get("bird_username", "")
label = f"Bird ({username})" if source == "bird" and username else source.upper()
lines.append(f"{Colors.DIM}│{Colors.RESET} {Colors.GREEN}✅ X/Twitter{Colors.RESET} — {label} {Colors.DIM}│{Colors.RESET}")
else:
lines.append(f"{Colors.DIM}│{Colors.RESET} {Colors.RED}❌ X/Twitter{Colors.RESET} — No Bird CLI or XAI_API_KEY {Colors.DIM}│{Colors.RESET}")
if diag.get("bird_installed"):
lines.append(f"{Colors.DIM}│{Colors.RESET} └─ Bird installed but not authenticated {Colors.DIM}│{Colors.RESET}")
lines.append(f"{Colors.DIM}│{Colors.RESET} └─ Log into x.com in your browser, then retry {Colors.DIM}│{Colors.RESET}")
else:
lines.append(f"{Colors.DIM}│{Colors.RESET} └─ Needs Node.js 22+ (Bird is bundled) {Colors.DIM}│{Colors.RESET}")
# YouTube
if has_youtube:
lines.append(f"{Colors.DIM}│{Colors.RESET} {Colors.GREEN}✅ YouTube{Colors.RESET} — yt-dlp found {Colors.DIM}│{Colors.RESET}")
else:
lines.append(f"{Colors.DIM}│{Colors.RESET} {Colors.RED}❌ YouTube{Colors.RESET} — yt-dlp not installed {Colors.DIM}│{Colors.RESET}")
lines.append(f"{Colors.DIM}│{Colors.RESET} └─ Fix: brew install yt-dlp (free) {Colors.DIM}│{Colors.RESET}")
# Web
if has_web:
backend = diag.get("web_search_backend", "")
lines.append(f"{Colors.DIM}│{Colors.RESET} {Colors.GREEN}✅ Web{Colors.RESET} — {backend} API {Colors.DIM}│{Colors.RESET}")
else:
lines.append(f"{Colors.DIM}│{Colors.RESET} {Colors.YELLOW}⚡ Web{Colors.RESET} — Using assistant's search tool {Colors.DIM}│{Colors.RESET}")
lines.append(f"{Colors.DIM}│{Colors.RESET} {Colors.DIM}│{Colors.RESET}")
lines.append(f"{Colors.DIM}│{Colors.RESET} Config: {Colors.BOLD}~/.config/last30days/.env{Colors.RESET} {Colors.DIM}│{Colors.RESET}")
lines.append(f"{Colors.DIM}└─────────────────────────────────────────────────────┘{Colors.RESET}")
else:
# Plain text for non-TTY (Claude Code / Codex)
lines.append("┌─────────────────────────────────────────────────────┐")
lines.append("│ /last30days v2.5 — Source Status │")
lines.append("│ │")
if has_openai:
lines.append("│ ✅ Reddit — OPENAI_API_KEY found │")
else:
lines.append("│ ❌ Reddit — No OPENAI_API_KEY │")
lines.append("│ └─ Add to ~/.config/last30days/.env │")
if has_x:
lines.append("│ ✅ X/Twitter — available │")
else:
lines.append("│ ❌ X/Twitter — No Bird CLI or XAI_API_KEY │")
if diag.get("bird_installed"):
lines.append("│ └─ Log into x.com in your browser, then retry │")
else:
lines.append("│ └─ Needs Node.js 22+ (Bird is bundled) │")
if has_youtube:
lines.append("│ ✅ YouTube — yt-dlp found │")
else:
lines.append("│ ❌ YouTube — yt-dlp not installed │")
lines.append("│ └─ Fix: brew install yt-dlp (free) │")
if has_web:
lines.append("│ ✅ Web — API search available │")
else:
lines.append("│ ⚡ Web — Using assistant's search tool │")
lines.append("│ │")
lines.append("│ Config: ~/.config/last30days/.env │")
lines.append("└─────────────────────────────────────────────────────┘")
sys.stderr.write("\n".join(lines) + "\n\n")
sys.stderr.flush()
def print_phase(phase: str, message: str):
"""Print a phase message."""
colors = {
"reddit": Colors.YELLOW,
"x": Colors.CYAN,
"process": Colors.PURPLE,
"done": Colors.GREEN,
"error": Colors.RED,
}
color = colors.get(phase, Colors.RESET)
sys.stderr.write(f"{color}{Colors.RESET} {message}\n")
sys.stderr.flush()


@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2025 Peter Steinberger
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


@@ -0,0 +1,134 @@
#!/usr/bin/env node
/**
* bird-search.mjs - Vendored Bird CLI search wrapper for /last30days.
* Subset of @steipete/bird v0.8.0 (MIT License, Peter Steinberger).
*
* Usage:
* node bird-search.mjs <query> [--count N] [--json]
* node bird-search.mjs --whoami
* node bird-search.mjs --check
*/
import { resolveCredentials } from './lib/cookies.js';
import { TwitterClientBase } from './lib/twitter-client-base.js';
import { withSearch } from './lib/twitter-client-search.js';
// Build a search-only client (no posting, bookmarks, etc.)
const SearchClient = withSearch(TwitterClientBase);
const args = process.argv.slice(2);
// --check: verify that credentials can be resolved
if (args.includes('--check')) {
try {
const { cookies, warnings } = await resolveCredentials({});
if (cookies.authToken && cookies.ct0) {
process.stdout.write(JSON.stringify({ authenticated: true, source: cookies.source }));
process.exit(0);
} else {
process.stdout.write(JSON.stringify({ authenticated: false, warnings }));
process.exit(1);
}
} catch (err) {
process.stdout.write(JSON.stringify({ authenticated: false, error: err.message }));
process.exit(1);
}
}
// --whoami: check auth and output source
if (args.includes('--whoami')) {
try {
const { cookies } = await resolveCredentials({});
if (cookies.authToken && cookies.ct0) {
process.stdout.write(cookies.source || 'authenticated');
process.exit(0);
} else {
process.stderr.write('Not authenticated\n');
process.exit(1);
}
} catch (err) {
process.stderr.write(`Auth check failed: ${err.message}\n`);
process.exit(1);
}
}
// Parse search args
let query = null;
let count = 20;
let jsonOutput = false;
for (let i = 0; i < args.length; i++) {
if (args[i] === '--count' && args[i + 1]) {
count = parseInt(args[i + 1], 10);
i++;
} else if (args[i] === '-n' && args[i + 1]) {
count = parseInt(args[i + 1], 10);
i++;
} else if (args[i] === '--json') {
jsonOutput = true;
} else if (!args[i].startsWith('-')) {
query = args[i];
}
}
if (!query) {
process.stderr.write('Usage: node bird-search.mjs <query> [--count N] [--json]\n');
process.exit(1);
}
try {
// Resolve credentials (env vars, then browser cookies)
const { cookies, warnings } = await resolveCredentials({});
if (!cookies.authToken || !cookies.ct0) {
const msg = warnings.length > 0 ? warnings.join('; ') : 'No Twitter credentials found';
if (jsonOutput) {
process.stdout.write(JSON.stringify({ error: msg, items: [] }));
} else {
process.stderr.write(`Error: ${msg}\n`);
}
process.exit(1);
}
// Create search client
const client = new SearchClient({
cookies: {
authToken: cookies.authToken,
ct0: cookies.ct0,
cookieHeader: cookies.cookieHeader,
},
timeoutMs: 30000,
});
// Run search
const result = await client.search(query, count);
if (!result.success) {
if (jsonOutput) {
process.stdout.write(JSON.stringify({ error: result.error, items: [] }));
} else {
process.stderr.write(`Search failed: ${result.error}\n`);
}
process.exit(1);
}
// Output results
const tweets = result.tweets || [];
if (jsonOutput) {
process.stdout.write(JSON.stringify(tweets));
} else {
for (const tweet of tweets) {
const author = tweet.author?.username || 'unknown';
process.stdout.write(`@${author}: ${tweet.text?.slice(0, 200)}\n\n`);
}
}
process.exit(0);
} catch (err) {
if (jsonOutput) {
process.stdout.write(JSON.stringify({ error: err.message, items: [] }));
} else {
process.stderr.write(`Error: ${err.message}\n`);
}
process.exit(1);
}


@@ -0,0 +1,173 @@
/**
* Browser cookie extraction for Twitter authentication.
* Delegates to @steipete/sweet-cookie for Safari/Chrome/Firefox reads.
*/
import { getCookies } from '@steipete/sweet-cookie';
const TWITTER_COOKIE_NAMES = ['auth_token', 'ct0'];
const TWITTER_URL = 'https://x.com/';
const TWITTER_ORIGINS = ['https://x.com/', 'https://twitter.com/'];
const DEFAULT_COOKIE_TIMEOUT_MS = 30_000;
function normalizeValue(value) {
if (typeof value !== 'string') {
return null;
}
const trimmed = value.trim();
return trimmed.length > 0 ? trimmed : null;
}
function cookieHeader(authToken, ct0) {
return `auth_token=${authToken}; ct0=${ct0}`;
}
function buildEmpty() {
return { authToken: null, ct0: null, cookieHeader: null, source: null };
}
function readEnvCookie(cookies, keys, field) {
if (cookies[field]) {
return;
}
for (const key of keys) {
const value = normalizeValue(process.env[key]);
if (!value) {
continue;
}
cookies[field] = value;
if (!cookies.source) {
cookies.source = `env ${key}`;
}
break;
}
}
function resolveSources(cookieSource) {
if (Array.isArray(cookieSource)) {
return cookieSource;
}
if (cookieSource) {
return [cookieSource];
}
return ['safari', 'chrome', 'firefox'];
}
function labelForSource(source, profile) {
if (source === 'safari') {
return 'Safari';
}
if (source === 'chrome') {
return profile ? `Chrome profile "${profile}"` : 'Chrome default profile';
}
return profile ? `Firefox profile "${profile}"` : 'Firefox default profile';
}
function pickCookieValue(cookies, name) {
const matches = cookies.filter((c) => c?.name === name && typeof c.value === 'string');
if (matches.length === 0) {
return null;
}
const preferred = matches.find((c) => (c.domain ?? '').endsWith('x.com'));
if (preferred?.value) {
return preferred.value;
}
const twitter = matches.find((c) => (c.domain ?? '').endsWith('twitter.com'));
if (twitter?.value) {
return twitter.value;
}
return matches[0]?.value ?? null;
}
async function readTwitterCookiesFromBrowser(options) {
const warnings = [];
const out = buildEmpty();
const { cookies, warnings: providerWarnings } = await getCookies({
url: TWITTER_URL,
origins: TWITTER_ORIGINS,
names: [...TWITTER_COOKIE_NAMES],
browsers: [options.source],
mode: 'merge',
chromeProfile: options.chromeProfile,
firefoxProfile: options.firefoxProfile,
timeoutMs: options.cookieTimeoutMs,
});
warnings.push(...providerWarnings);
const authToken = pickCookieValue(cookies, 'auth_token');
const ct0 = pickCookieValue(cookies, 'ct0');
if (authToken) {
out.authToken = authToken;
}
if (ct0) {
out.ct0 = ct0;
}
if (out.authToken && out.ct0) {
out.cookieHeader = cookieHeader(out.authToken, out.ct0);
out.source = labelForSource(options.source, options.source === 'chrome' ? options.chromeProfile : options.firefoxProfile);
return { cookies: out, warnings };
}
if (options.source === 'safari') {
warnings.push('No Twitter cookies found in Safari. Make sure you are logged into x.com in Safari.');
}
else if (options.source === 'chrome') {
warnings.push('No Twitter cookies found in Chrome. Make sure you are logged into x.com in Chrome.');
}
else {
warnings.push('No Twitter cookies found in Firefox. Make sure you are logged into x.com in Firefox and the profile exists.');
}
return { cookies: out, warnings };
}
export async function extractCookiesFromSafari() {
return readTwitterCookiesFromBrowser({ source: 'safari' });
}
export async function extractCookiesFromChrome(profile) {
return readTwitterCookiesFromBrowser({ source: 'chrome', chromeProfile: profile });
}
export async function extractCookiesFromFirefox(profile) {
return readTwitterCookiesFromBrowser({ source: 'firefox', firefoxProfile: profile });
}
/**
* Resolve Twitter credentials from multiple sources.
* Priority: CLI args > environment variables > browsers (ordered).
*/
export async function resolveCredentials(options) {
const warnings = [];
const cookies = buildEmpty();
const cookieTimeoutMs = typeof options.cookieTimeoutMs === 'number' &&
Number.isFinite(options.cookieTimeoutMs) &&
options.cookieTimeoutMs > 0
? options.cookieTimeoutMs
: process.platform === 'darwin'
? DEFAULT_COOKIE_TIMEOUT_MS
: undefined;
if (options.authToken) {
cookies.authToken = options.authToken;
cookies.source = 'CLI argument';
}
if (options.ct0) {
cookies.ct0 = options.ct0;
if (!cookies.source) {
cookies.source = 'CLI argument';
}
}
readEnvCookie(cookies, ['AUTH_TOKEN', 'TWITTER_AUTH_TOKEN'], 'authToken');
readEnvCookie(cookies, ['CT0', 'TWITTER_CT0'], 'ct0');
if (cookies.authToken && cookies.ct0) {
cookies.cookieHeader = cookieHeader(cookies.authToken, cookies.ct0);
return { cookies, warnings };
}
const sourcesToTry = resolveSources(options.cookieSource);
for (const source of sourcesToTry) {
const res = await readTwitterCookiesFromBrowser({
source,
chromeProfile: options.chromeProfile,
firefoxProfile: options.firefoxProfile,
cookieTimeoutMs,
});
warnings.push(...res.warnings);
if (res.cookies.authToken && res.cookies.ct0) {
return { cookies: res.cookies, warnings };
}
}
if (!cookies.authToken) {
warnings.push('Missing auth_token - provide via --auth-token, AUTH_TOKEN env var, or login to x.com in Safari/Chrome/Firefox');
}
if (!cookies.ct0) {
warnings.push('Missing ct0 - provide via --ct0, CT0 env var, or login to x.com in Safari/Chrome/Firefox');
}
if (cookies.authToken && cookies.ct0) {
cookies.cookieHeader = cookieHeader(cookies.authToken, cookies.ct0);
}
return { cookies, warnings };
}
//# sourceMappingURL=cookies.js.map


@@ -0,0 +1,17 @@
{
"global": {
"responsive_web_grok_annotations_enabled": false,
"post_ctas_fetch_enabled": true,
"responsive_web_graphql_exclude_directive_enabled": true
},
"sets": {
"lists": {
"blue_business_profile_image_shape_enabled": true,
"tweetypie_unmention_optimization_enabled": true,
"responsive_web_text_conversations_enabled": false,
"interactive_text_enabled": true,
"vibe_api_enabled": true,
"responsive_web_twitter_blue_verified_badge_is_enabled": true
}
}
}


@@ -0,0 +1,37 @@
export async function paginateCursor(opts) {
const { maxPages, pageDelayMs = 1000 } = opts;
const seen = new Set();
const items = [];
let cursor = opts.cursor;
let pagesFetched = 0;
while (true) {
if (pagesFetched > 0 && pageDelayMs > 0) {
await opts.sleep(pageDelayMs);
}
const page = await opts.fetchPage(cursor);
if (!page.success) {
if (items.length > 0) {
return { success: false, error: page.error, items, nextCursor: cursor };
}
return page;
}
pagesFetched += 1;
for (const item of page.items) {
const key = opts.getKey(item);
if (seen.has(key)) {
continue;
}
seen.add(key);
items.push(item);
}
const pageCursor = page.cursor;
if (!pageCursor || pageCursor === cursor) {
return { success: true, items, nextCursor: undefined };
}
if (maxPages !== undefined && pagesFetched >= maxPages) {
return { success: true, items, nextCursor: pageCursor };
}
cursor = pageCursor;
}
}
//# sourceMappingURL=paginate-cursor.js.map


@@ -0,0 +1,20 @@
{
"CreateTweet": "nmdAQXJDxw6-0KKF2on7eA",
"CreateRetweet": "LFho5rIi4xcKO90p9jwG7A",
"CreateFriendship": "8h9JVdV8dlSyqyRDJEPCsA",
"DestroyFriendship": "ppXWuagMNXgvzx6WoXBW0Q",
"FavoriteTweet": "lI07N6Otwv1PhnEgXILM7A",
"DeleteBookmark": "Wlmlj2-xzyS1GN3a6cj-mQ",
"TweetDetail": "_NvJCnIjOW__EP5-RF197A",
"SearchTimeline": "6AAys3t42mosm_yTI_QENg",
"Bookmarks": "RV1g3b8n_SGOHwkqKYSCFw",
"BookmarkFolderTimeline": "KJIQpsvxrTfRIlbaRIySHQ",
"Following": "mWYeougg_ocJS2Vr1Vt28w",
"Followers": "SFYY3WsgwjlXSLlfnEUE4A",
"Likes": "ETJflBunfqNa1uE1mBPCaw",
"ExploreSidebar": "lpSN4M6qpimkF4nRFPE3nQ",
"ExplorePage": "kheAINB_4pzRDqkzG3K-ng",
"GenericTimelineById": "uGSr7alSjR9v6QJAIaqSKQ",
"TrendHistory": "Sj4T-jSB9pr0Mxtsc1UKZQ",
"AboutAccountQuery": "zs_jFPFT78rBpXv9Z3U2YQ"
}


@@ -0,0 +1,151 @@
import { existsSync, readFileSync } from 'node:fs';
import { mkdir, writeFile } from 'node:fs/promises';
import { homedir } from 'node:os';
import path from 'node:path';
// biome-ignore lint/correctness/useImportExtensions: JSON module import doesn't use .js extension.
import defaultOverrides from './features.json' with { type: 'json' };
const DEFAULT_CACHE_FILENAME = 'features.json';
let cachedOverrides = null;
function normalizeFeatureMap(value) {
if (!value || typeof value !== 'object' || Array.isArray(value)) {
return {};
}
const result = {};
for (const [key, entry] of Object.entries(value)) {
if (typeof entry === 'boolean') {
result[key] = entry;
}
}
return result;
}
function normalizeOverrides(value) {
if (!value || typeof value !== 'object' || Array.isArray(value)) {
return { global: {}, sets: {} };
}
const record = value;
const global = normalizeFeatureMap(record.global);
const sets = {};
const rawSets = record.sets && typeof record.sets === 'object' && !Array.isArray(record.sets)
? record.sets
: {};
for (const [setName, setValue] of Object.entries(rawSets)) {
const normalized = normalizeFeatureMap(setValue);
if (Object.keys(normalized).length > 0) {
sets[setName] = normalized;
}
}
return { global, sets };
}
function mergeOverrides(base, next) {
const sets = { ...base.sets };
for (const [setName, overrides] of Object.entries(next.sets)) {
const existing = sets[setName];
sets[setName] = existing ? { ...existing, ...overrides } : { ...overrides };
}
return {
global: { ...base.global, ...next.global },
sets,
};
}
function toFeatureOverrides(overrides) {
const result = {};
if (Object.keys(overrides.global).length > 0) {
result.global = overrides.global;
}
const setEntries = Object.entries(overrides.sets).filter(([, value]) => Object.keys(value).length > 0);
if (setEntries.length > 0) {
result.sets = Object.fromEntries(setEntries);
}
return result;
}
function resolveFeaturesCachePath() {
const override = process.env.BIRD_FEATURES_CACHE ?? process.env.BIRD_FEATURES_PATH;
if (override && override.trim().length > 0) {
return path.resolve(override.trim());
}
return path.join(homedir(), '.config', 'bird', DEFAULT_CACHE_FILENAME);
}
function readOverridesFromFile(cachePath) {
if (!existsSync(cachePath)) {
return null;
}
try {
const raw = readFileSync(cachePath, 'utf8');
return normalizeOverrides(JSON.parse(raw));
}
catch {
return null;
}
}
function readOverridesFromEnv() {
const raw = process.env.BIRD_FEATURES_JSON;
if (!raw || raw.trim().length === 0) {
return null;
}
try {
return normalizeOverrides(JSON.parse(raw));
}
catch {
return null;
}
}
function writeOverridesToDisk(cachePath, overrides) {
const payload = toFeatureOverrides(overrides);
return mkdir(path.dirname(cachePath), { recursive: true }).then(() => writeFile(cachePath, `${JSON.stringify(payload, null, 2)}\n`, 'utf8'));
}
export function loadFeatureOverrides() {
if (cachedOverrides) {
return cachedOverrides;
}
const base = normalizeOverrides(defaultOverrides);
const fromFile = readOverridesFromFile(resolveFeaturesCachePath());
const fromEnv = readOverridesFromEnv();
let merged = base;
if (fromFile) {
merged = mergeOverrides(merged, fromFile);
}
if (fromEnv) {
merged = mergeOverrides(merged, fromEnv);
}
cachedOverrides = merged;
return merged;
}
export function getFeatureOverridesSnapshot() {
const overrides = toFeatureOverrides(loadFeatureOverrides());
return {
cachePath: resolveFeaturesCachePath(),
overrides,
};
}
export function applyFeatureOverrides(setName, base) {
const overrides = loadFeatureOverrides();
const globalOverrides = overrides.global;
const setOverrides = overrides.sets[setName];
if (Object.keys(globalOverrides).length === 0 && (!setOverrides || Object.keys(setOverrides).length === 0)) {
return base;
}
if (setOverrides) {
return {
...base,
...globalOverrides,
...setOverrides,
};
}
return {
...base,
...globalOverrides,
};
}
export async function refreshFeatureOverridesCache() {
const cachePath = resolveFeaturesCachePath();
const base = normalizeOverrides(defaultOverrides);
const fromFile = readOverridesFromFile(cachePath);
const merged = mergeOverrides(base, fromFile ?? { global: {}, sets: {} });
await writeOverridesToDisk(cachePath, merged);
cachedOverrides = null;
return { cachePath, overrides: toFeatureOverrides(merged) };
}
export function clearFeatureOverridesCache() {
cachedOverrides = null;
}
//# sourceMappingURL=runtime-features.js.map


@@ -0,0 +1,264 @@
import { mkdir, readFile, writeFile } from 'node:fs/promises';
import { homedir } from 'node:os';
import path from 'node:path';
const DEFAULT_CACHE_FILENAME = 'query-ids-cache.json';
const DEFAULT_TTL_MS = 24 * 60 * 60 * 1000;
const DISCOVERY_PAGES = [
'https://x.com/?lang=en',
'https://x.com/explore',
'https://x.com/notifications',
'https://x.com/settings/profile',
];
const BUNDLE_URL_REGEX = /https:\/\/abs\.twimg\.com\/responsive-web\/client-web(?:-legacy)?\/[A-Za-z0-9.-]+\.js/g;
const QUERY_ID_REGEX = /^[a-zA-Z0-9_-]+$/;
const OPERATION_PATTERNS = [
{
regex: /e\.exports=\{queryId\s*:\s*["']([^"']+)["']\s*,\s*operationName\s*:\s*["']([^"']+)["']/gs,
operationGroup: 2,
queryIdGroup: 1,
},
{
regex: /e\.exports=\{operationName\s*:\s*["']([^"']+)["']\s*,\s*queryId\s*:\s*["']([^"']+)["']/gs,
operationGroup: 1,
queryIdGroup: 2,
},
{
regex: /operationName\s*[:=]\s*["']([^"']+)["'](.{0,4000}?)queryId\s*[:=]\s*["']([^"']+)["']/gs,
operationGroup: 1,
queryIdGroup: 3,
},
{
regex: /queryId\s*[:=]\s*["']([^"']+)["'](.{0,4000}?)operationName\s*[:=]\s*["']([^"']+)["']/gs,
operationGroup: 3,
queryIdGroup: 1,
},
];
const HEADERS = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36',
Accept: 'text/html,application/json;q=0.9,*/*;q=0.8',
'Accept-Language': 'en-US,en;q=0.9',
};
async function fetchText(fetchImpl, url) {
const response = await fetchImpl(url, { headers: HEADERS });
if (!response.ok) {
const body = await response.text().catch(() => '');
throw new Error(`HTTP ${response.status} for ${url}: ${body.slice(0, 120)}`);
}
return response.text();
}
function resolveDefaultCachePath() {
const override = process.env.BIRD_QUERY_IDS_CACHE;
if (override && override.trim().length > 0) {
return path.resolve(override.trim());
}
return path.join(homedir(), '.config', 'bird', DEFAULT_CACHE_FILENAME);
}
function parseSnapshot(raw) {
if (!raw || typeof raw !== 'object') {
return null;
}
const record = raw;
const fetchedAt = typeof record.fetchedAt === 'string' ? record.fetchedAt : null;
const ttlMs = typeof record.ttlMs === 'number' && Number.isFinite(record.ttlMs) ? record.ttlMs : null;
const ids = record.ids && typeof record.ids === 'object' ? record.ids : null;
const discovery = record.discovery && typeof record.discovery === 'object' ? record.discovery : null;
if (!fetchedAt || !ttlMs || !ids || !discovery) {
return null;
}
const pages = Array.isArray(discovery.pages) ? discovery.pages : null;
const bundles = Array.isArray(discovery.bundles) ? discovery.bundles : null;
if (!pages || !bundles) {
return null;
}
const normalizedIds = {};
for (const [key, value] of Object.entries(ids)) {
if (typeof value === 'string' && value.trim().length > 0) {
normalizedIds[key] = value.trim();
}
}
return {
fetchedAt,
ttlMs,
ids: normalizedIds,
discovery: {
pages: pages.filter((p) => typeof p === 'string'),
bundles: bundles.filter((b) => typeof b === 'string'),
},
};
}
async function readSnapshotFromDisk(cachePath) {
try {
const raw = await readFile(cachePath, 'utf8');
return parseSnapshot(JSON.parse(raw));
}
catch {
return null;
}
}
async function writeSnapshotToDisk(cachePath, snapshot) {
await mkdir(path.dirname(cachePath), { recursive: true });
await writeFile(cachePath, `${JSON.stringify(snapshot, null, 2)}\n`, 'utf8');
}
async function discoverBundles(fetchImpl) {
const bundles = new Set();
for (const page of DISCOVERY_PAGES) {
try {
const html = await fetchText(fetchImpl, page);
for (const match of html.matchAll(BUNDLE_URL_REGEX)) {
bundles.add(match[0]);
}
}
catch {
// ignore discovery page failures; other pages often work
}
}
const discovered = [...bundles];
if (discovered.length === 0) {
throw new Error('No client bundles discovered; x.com layout may have changed.');
}
return discovered;
}
function extractOperations(bundleContents, bundleLabel, targets, discovered) {
for (const pattern of OPERATION_PATTERNS) {
pattern.regex.lastIndex = 0;
while (true) {
const match = pattern.regex.exec(bundleContents);
if (match === null) {
break;
}
const operationName = match[pattern.operationGroup];
const queryId = match[pattern.queryIdGroup];
if (!operationName || !queryId) {
continue;
}
if (!targets.has(operationName)) {
continue;
}
if (!QUERY_ID_REGEX.test(queryId)) {
continue;
}
if (discovered.has(operationName)) {
continue;
}
discovered.set(operationName, { queryId, bundle: bundleLabel });
if (discovered.size === targets.size) {
return;
}
}
}
}
async function fetchAndExtract(fetchImpl, bundleUrls, targets) {
const discovered = new Map();
const CONCURRENCY = 6;
for (let i = 0; i < bundleUrls.length; i += CONCURRENCY) {
const chunk = bundleUrls.slice(i, i + CONCURRENCY);
await Promise.all(chunk.map(async (url) => {
if (discovered.size === targets.size) {
return;
}
const label = url.split('/').at(-1) ?? url;
try {
const js = await fetchText(fetchImpl, url);
extractOperations(js, label, targets, discovered);
}
catch {
// ignore failed bundles
}
}));
if (discovered.size === targets.size) {
break;
}
}
return discovered;
}
export function createRuntimeQueryIdStore(options = {}) {
const fetchImpl = options.fetchImpl ?? fetch;
const ttlMs = options.ttlMs ?? DEFAULT_TTL_MS;
const cachePath = options.cachePath ? path.resolve(options.cachePath) : resolveDefaultCachePath();
let memorySnapshot = null;
let loadOnce = null;
let refreshInFlight = null;
const loadSnapshot = async () => {
if (memorySnapshot) {
return memorySnapshot;
}
if (!loadOnce) {
loadOnce = (async () => {
const fromDisk = await readSnapshotFromDisk(cachePath);
memorySnapshot = fromDisk;
return fromDisk;
})();
}
return loadOnce;
};
const getSnapshotInfo = async () => {
const snapshot = await loadSnapshot();
if (!snapshot) {
return null;
}
const fetchedAtMs = new Date(snapshot.fetchedAt).getTime();
const ageMs = Number.isFinite(fetchedAtMs) ? Math.max(0, Date.now() - fetchedAtMs) : Number.POSITIVE_INFINITY;
const effectiveTtl = Number.isFinite(snapshot.ttlMs) ? snapshot.ttlMs : ttlMs;
const isFresh = ageMs <= effectiveTtl;
return { snapshot, cachePath, ageMs, isFresh };
};
const getQueryId = async (operationName) => {
const info = await getSnapshotInfo();
if (!info) {
return null;
}
return info.snapshot.ids[operationName] ?? null;
};
const refresh = async (operationNames, opts = {}) => {
if (refreshInFlight) {
return refreshInFlight;
}
refreshInFlight = (async () => {
const current = await getSnapshotInfo();
if (!opts.force && current?.isFresh) {
return current;
}
const targets = new Set(operationNames);
const bundleUrls = await discoverBundles(fetchImpl);
const discovered = await fetchAndExtract(fetchImpl, bundleUrls, targets);
if (discovered.size === 0) {
return current ?? null;
}
const ids = {};
for (const name of operationNames) {
const entry = discovered.get(name);
if (entry?.queryId) {
ids[name] = entry.queryId;
}
}
const snapshot = {
fetchedAt: new Date().toISOString(),
ttlMs,
ids,
discovery: {
pages: [...DISCOVERY_PAGES],
bundles: bundleUrls.map((url) => url.split('/').at(-1) ?? url),
},
};
await writeSnapshotToDisk(cachePath, snapshot);
memorySnapshot = snapshot;
return getSnapshotInfo();
})().finally(() => {
refreshInFlight = null;
});
return refreshInFlight;
};
return {
cachePath,
ttlMs,
getSnapshotInfo,
getQueryId,
refresh,
clearMemory() {
memorySnapshot = null;
loadOnce = null;
},
};
}
export const runtimeQueryIds = createRuntimeQueryIdStore();
//# sourceMappingURL=runtime-query-ids.js.map


@@ -0,0 +1,129 @@
import { randomBytes, randomUUID } from 'node:crypto';
import { runtimeQueryIds } from './runtime-query-ids.js';
import { QUERY_IDS, TARGET_QUERY_ID_OPERATIONS } from './twitter-client-constants.js';
import { normalizeQuoteDepth } from './twitter-client-utils.js';
export class TwitterClientBase {
authToken;
ct0;
cookieHeader;
userAgent;
timeoutMs;
quoteDepth;
clientUuid;
clientDeviceId;
clientUserId;
constructor(options) {
if (!options.cookies.authToken || !options.cookies.ct0) {
throw new Error('Both authToken and ct0 cookies are required');
}
this.authToken = options.cookies.authToken;
this.ct0 = options.cookies.ct0;
this.cookieHeader = options.cookies.cookieHeader || `auth_token=${this.authToken}; ct0=${this.ct0}`;
this.userAgent =
options.userAgent ||
'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36';
this.timeoutMs = options.timeoutMs;
this.quoteDepth = normalizeQuoteDepth(options.quoteDepth);
this.clientUuid = randomUUID();
this.clientDeviceId = randomUUID();
}
async sleep(ms) {
await new Promise((resolve) => setTimeout(resolve, ms));
}
async getQueryId(operationName) {
const cached = await runtimeQueryIds.getQueryId(operationName);
return cached ?? QUERY_IDS[operationName];
}
async refreshQueryIds() {
if (process.env.NODE_ENV === 'test') {
return;
}
try {
await runtimeQueryIds.refresh(TARGET_QUERY_ID_OPERATIONS, { force: true });
}
catch {
// ignore refresh failures; callers will fall back to baked-in IDs
}
}
async withRefreshedQueryIdsOn404(attempt) {
const firstAttempt = await attempt();
if (firstAttempt.success || !firstAttempt.had404) {
return { result: firstAttempt, refreshed: false };
}
await this.refreshQueryIds();
const secondAttempt = await attempt();
return { result: secondAttempt, refreshed: true };
}
async getTweetDetailQueryIds() {
const primary = await this.getQueryId('TweetDetail');
return Array.from(new Set([primary, '97JF30KziU00483E_8elBA', 'aFvUsJm2c-oDkJV75blV6g']));
}
async getSearchTimelineQueryIds() {
const primary = await this.getQueryId('SearchTimeline');
return Array.from(new Set([primary, 'M1jEez78PEfVfbQLvlWMvQ', '5h0kNbk3ii97rmfY6CdgAA', 'Tp1sewRU1AsZpBWhqCZicQ']));
}
async fetchWithTimeout(url, init) {
if (!this.timeoutMs || this.timeoutMs <= 0) {
return fetch(url, init);
}
const controller = new AbortController();
const timeoutId = setTimeout(() => controller.abort(), this.timeoutMs);
try {
return await fetch(url, { ...init, signal: controller.signal });
}
finally {
clearTimeout(timeoutId);
}
}
getHeaders() {
return this.getJsonHeaders();
}
createTransactionId() {
return randomBytes(16).toString('hex');
}
getBaseHeaders() {
const headers = {
accept: '*/*',
'accept-language': 'en-US,en;q=0.9',
authorization: 'Bearer AAAAAAAAAAAAAAAAAAAAANRILgAAAAAAnNwIzUejRCOuH5E6I8xnZz4puTs%3D1Zv7ttfk8LF81IUq16cHjhLTvJu4FA33AGWWjCpTnA',
'x-csrf-token': this.ct0,
'x-twitter-auth-type': 'OAuth2Session',
'x-twitter-active-user': 'yes',
'x-twitter-client-language': 'en',
'x-client-uuid': this.clientUuid,
'x-twitter-client-deviceid': this.clientDeviceId,
'x-client-transaction-id': this.createTransactionId(),
cookie: this.cookieHeader,
'user-agent': this.userAgent,
origin: 'https://x.com',
referer: 'https://x.com/',
};
if (this.clientUserId) {
headers['x-twitter-client-user-id'] = this.clientUserId;
}
return headers;
}
getJsonHeaders() {
return {
...this.getBaseHeaders(),
'content-type': 'application/json',
};
}
getUploadHeaders() {
// Note: do not set content-type; URLSearchParams/FormData need to set it (incl boundary) themselves.
return this.getBaseHeaders();
}
async ensureClientUserId() {
if (process.env.NODE_ENV === 'test') {
return;
}
if (this.clientUserId) {
return;
}
const result = await this.getCurrentUser();
if (result.success && result.user?.id) {
this.clientUserId = result.user.id;
}
}
}
//# sourceMappingURL=twitter-client-base.js.map


@@ -0,0 +1,50 @@
// biome-ignore lint/correctness/useImportExtensions: JSON module import doesn't use .js extension.
import queryIds from './query-ids.json' with { type: 'json' };
export const TWITTER_API_BASE = 'https://x.com/i/api/graphql';
export const TWITTER_GRAPHQL_POST_URL = 'https://x.com/i/api/graphql';
export const TWITTER_UPLOAD_URL = 'https://upload.twitter.com/i/media/upload.json';
export const TWITTER_MEDIA_METADATA_URL = 'https://x.com/i/api/1.1/media/metadata/create.json';
export const TWITTER_STATUS_UPDATE_URL = 'https://x.com/i/api/1.1/statuses/update.json';
export const SETTINGS_SCREEN_NAME_REGEX = /"screen_name":"([^"]+)"/;
export const SETTINGS_USER_ID_REGEX = /"user_id"\s*:\s*"(\d+)"/;
export const SETTINGS_NAME_REGEX = /"name":"([^"\\]*(?:\\.[^"\\]*)*)"/;
// Query IDs rotate frequently; the values in query-ids.json are refreshed by
// scripts/update-query-ids.ts. The fallback values keep the client usable if
// the file is missing or incomplete.
export const FALLBACK_QUERY_IDS = {
CreateTweet: 'TAJw1rBsjAtdNgTdlo2oeg',
CreateRetweet: 'ojPdsZsimiJrUGLR1sjUtA',
DeleteRetweet: 'iQtK4dl5hBmXewYZuEOKVw',
CreateFriendship: '8h9JVdV8dlSyqyRDJEPCsA',
DestroyFriendship: 'ppXWuagMNXgvzx6WoXBW0Q',
FavoriteTweet: 'lI07N6Otwv1PhnEgXILM7A',
UnfavoriteTweet: 'ZYKSe-w7KEslx3JhSIk5LA',
CreateBookmark: 'aoDbu3RHznuiSkQ9aNM67Q',
DeleteBookmark: 'Wlmlj2-xzyS1GN3a6cj-mQ',
TweetDetail: '97JF30KziU00483E_8elBA',
SearchTimeline: 'M1jEez78PEfVfbQLvlWMvQ',
UserArticlesTweets: '8zBy9h4L90aDL02RsBcCFg',
UserTweets: 'Wms1GvIiHXAPBaCr9KblaA',
Bookmarks: 'RV1g3b8n_SGOHwkqKYSCFw',
Following: 'BEkNpEt5pNETESoqMsTEGA',
Followers: 'kuFUYP9eV1FPoEy4N-pi7w',
Likes: 'JR2gceKucIKcVNB_9JkhsA',
BookmarkFolderTimeline: 'KJIQpsvxrTfRIlbaRIySHQ',
ListOwnerships: 'wQcOSjSQ8NtgxIwvYl1lMg',
ListMemberships: 'BlEXXdARdSeL_0KyKHHvvg',
ListLatestTweetsTimeline: '2TemLyqrMpTeAmysdbnVqw',
ListByRestId: 'wXzyA5vM_aVkBL9G8Vp3kw',
HomeTimeline: 'edseUwk9sP5Phz__9TIRnA',
HomeLatestTimeline: 'iOEZpOdfekFsxSlPQCQtPg',
ExploreSidebar: 'lpSN4M6qpimkF4nRFPE3nQ',
ExplorePage: 'kheAINB_4pzRDqkzG3K-ng',
GenericTimelineById: 'uGSr7alSjR9v6QJAIaqSKQ',
TrendHistory: 'Sj4T-jSB9pr0Mxtsc1UKZQ',
AboutAccountQuery: 'zs_jFPFT78rBpXv9Z3U2YQ',
};
export const QUERY_IDS = {
...FALLBACK_QUERY_IDS,
...queryIds,
};
export const TARGET_QUERY_ID_OPERATIONS = Object.keys(FALLBACK_QUERY_IDS);
//# sourceMappingURL=twitter-client-constants.js.map
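The spread order in `QUERY_IDS` above means freshly scraped IDs from query-ids.json win over the baked-in fallbacks, while any operation missing from the JSON keeps its fallback value. A minimal sketch of that merge, using hypothetical IDs:

```javascript
// Hypothetical fallback + fetched query IDs demonstrating the spread-merge order.
const FALLBACK_IDS = { CreateTweet: 'fallback-create', SearchTimeline: 'fallback-search' };
const fetchedIds = { SearchTimeline: 'fresh-search' }; // e.g. loaded from query-ids.json
const mergedIds = { ...FALLBACK_IDS, ...fetchedIds };
// mergedIds.SearchTimeline is the fetched value; mergedIds.CreateTweet keeps the fallback.
```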


@@ -0,0 +1,347 @@
import { applyFeatureOverrides } from './runtime-features.js';
export function buildArticleFeatures() {
return applyFeatureOverrides('article', {
rweb_video_screen_enabled: true,
profile_label_improvements_pcf_label_in_post_enabled: true,
responsive_web_profile_redirect_enabled: true,
rweb_tipjar_consumption_enabled: true,
verified_phone_label_enabled: false,
creator_subscriptions_tweet_preview_api_enabled: true,
responsive_web_graphql_timeline_navigation_enabled: true,
responsive_web_graphql_exclude_directive_enabled: true,
responsive_web_graphql_skip_user_profile_image_extensions_enabled: false,
premium_content_api_read_enabled: false,
communities_web_enable_tweet_community_results_fetch: true,
c9s_tweet_anatomy_moderator_badge_enabled: true,
responsive_web_grok_analyze_button_fetch_trends_enabled: false,
responsive_web_grok_analyze_post_followups_enabled: false,
responsive_web_grok_annotations_enabled: false,
responsive_web_jetfuel_frame: true,
post_ctas_fetch_enabled: true,
responsive_web_grok_share_attachment_enabled: true,
articles_preview_enabled: true,
responsive_web_edit_tweet_api_enabled: true,
graphql_is_translatable_rweb_tweet_is_translatable_enabled: true,
view_counts_everywhere_api_enabled: true,
longform_notetweets_consumption_enabled: true,
responsive_web_twitter_article_tweet_consumption_enabled: true,
tweet_awards_web_tipping_enabled: false,
responsive_web_grok_show_grok_translated_post: false,
responsive_web_grok_analysis_button_from_backend: true,
creator_subscriptions_quote_tweet_preview_enabled: false,
freedom_of_speech_not_reach_fetch_enabled: true,
standardized_nudges_misinfo: true,
tweet_with_visibility_results_prefer_gql_limited_actions_policy_enabled: true,
longform_notetweets_rich_text_read_enabled: true,
longform_notetweets_inline_media_enabled: true,
responsive_web_grok_image_annotation_enabled: true,
responsive_web_grok_imagine_annotation_enabled: true,
responsive_web_grok_community_note_auto_translation_is_enabled: false,
responsive_web_enhance_cards_enabled: false,
});
}
export function buildTweetDetailFeatures() {
return applyFeatureOverrides('tweetDetail', {
...buildArticleFeatures(),
responsive_web_graphql_exclude_directive_enabled: true,
communities_web_enable_tweet_community_results_fetch: true,
responsive_web_twitter_article_plain_text_enabled: true,
responsive_web_twitter_article_seed_tweet_detail_enabled: true,
responsive_web_twitter_article_seed_tweet_summary_enabled: true,
longform_notetweets_rich_text_read_enabled: true,
longform_notetweets_inline_media_enabled: true,
responsive_web_edit_tweet_api_enabled: true,
tweet_awards_web_tipping_enabled: false,
creator_subscriptions_quote_tweet_preview_enabled: false,
verified_phone_label_enabled: false,
});
}
export function buildArticleFieldToggles() {
return {
withPayments: false,
withAuxiliaryUserLabels: false,
withArticleRichContentState: true,
withArticlePlainText: true,
withGrokAnalyze: false,
withDisallowedReplyControls: false,
};
}
export function buildSearchFeatures() {
return applyFeatureOverrides('search', {
rweb_video_screen_enabled: true,
profile_label_improvements_pcf_label_in_post_enabled: true,
responsive_web_profile_redirect_enabled: true,
rweb_tipjar_consumption_enabled: true,
verified_phone_label_enabled: false,
creator_subscriptions_tweet_preview_api_enabled: true,
responsive_web_graphql_timeline_navigation_enabled: true,
responsive_web_graphql_exclude_directive_enabled: true,
responsive_web_graphql_skip_user_profile_image_extensions_enabled: false,
premium_content_api_read_enabled: false,
communities_web_enable_tweet_community_results_fetch: true,
c9s_tweet_anatomy_moderator_badge_enabled: true,
responsive_web_grok_analyze_button_fetch_trends_enabled: false,
responsive_web_grok_analyze_post_followups_enabled: false,
responsive_web_grok_annotations_enabled: false,
responsive_web_jetfuel_frame: true,
post_ctas_fetch_enabled: true,
responsive_web_grok_share_attachment_enabled: true,
responsive_web_edit_tweet_api_enabled: true,
graphql_is_translatable_rweb_tweet_is_translatable_enabled: true,
view_counts_everywhere_api_enabled: true,
longform_notetweets_consumption_enabled: true,
responsive_web_twitter_article_tweet_consumption_enabled: true,
tweet_awards_web_tipping_enabled: false,
responsive_web_grok_show_grok_translated_post: false,
responsive_web_grok_analysis_button_from_backend: true,
creator_subscriptions_quote_tweet_preview_enabled: false,
freedom_of_speech_not_reach_fetch_enabled: true,
standardized_nudges_misinfo: true,
tweet_with_visibility_results_prefer_gql_limited_actions_policy_enabled: true,
rweb_video_timestamps_enabled: true,
longform_notetweets_rich_text_read_enabled: true,
longform_notetweets_inline_media_enabled: true,
responsive_web_grok_image_annotation_enabled: true,
responsive_web_grok_imagine_annotation_enabled: true,
responsive_web_grok_community_note_auto_translation_is_enabled: false,
articles_preview_enabled: true,
responsive_web_enhance_cards_enabled: false,
});
}
export function buildTweetCreateFeatures() {
return applyFeatureOverrides('tweetCreate', {
rweb_video_screen_enabled: true,
creator_subscriptions_tweet_preview_api_enabled: true,
premium_content_api_read_enabled: false,
communities_web_enable_tweet_community_results_fetch: true,
c9s_tweet_anatomy_moderator_badge_enabled: true,
responsive_web_grok_analyze_button_fetch_trends_enabled: false,
responsive_web_grok_analyze_post_followups_enabled: false,
responsive_web_grok_annotations_enabled: false,
responsive_web_jetfuel_frame: true,
post_ctas_fetch_enabled: true,
responsive_web_grok_share_attachment_enabled: true,
responsive_web_edit_tweet_api_enabled: true,
graphql_is_translatable_rweb_tweet_is_translatable_enabled: true,
view_counts_everywhere_api_enabled: true,
longform_notetweets_consumption_enabled: true,
responsive_web_twitter_article_tweet_consumption_enabled: true,
tweet_awards_web_tipping_enabled: false,
responsive_web_grok_show_grok_translated_post: false,
responsive_web_grok_analysis_button_from_backend: true,
creator_subscriptions_quote_tweet_preview_enabled: false,
longform_notetweets_rich_text_read_enabled: true,
longform_notetweets_inline_media_enabled: true,
profile_label_improvements_pcf_label_in_post_enabled: true,
responsive_web_profile_redirect_enabled: false,
rweb_tipjar_consumption_enabled: true,
verified_phone_label_enabled: false,
articles_preview_enabled: true,
responsive_web_grok_community_note_auto_translation_is_enabled: false,
responsive_web_graphql_skip_user_profile_image_extensions_enabled: false,
freedom_of_speech_not_reach_fetch_enabled: true,
standardized_nudges_misinfo: true,
tweet_with_visibility_results_prefer_gql_limited_actions_policy_enabled: true,
responsive_web_grok_image_annotation_enabled: true,
responsive_web_grok_imagine_annotation_enabled: true,
responsive_web_graphql_timeline_navigation_enabled: true,
responsive_web_enhance_cards_enabled: false,
});
}
export function buildTimelineFeatures() {
return applyFeatureOverrides('timeline', {
...buildSearchFeatures(),
blue_business_profile_image_shape_enabled: true,
responsive_web_text_conversations_enabled: false,
tweetypie_unmention_optimization_enabled: true,
vibe_api_enabled: true,
responsive_web_twitter_blue_verified_badge_is_enabled: true,
interactive_text_enabled: true,
longform_notetweets_richtext_consumption_enabled: true,
responsive_web_media_download_video_enabled: false,
});
}
export function buildBookmarksFeatures() {
return applyFeatureOverrides('bookmarks', {
...buildTimelineFeatures(),
graphql_timeline_v2_bookmark_timeline: true,
});
}
export function buildLikesFeatures() {
return applyFeatureOverrides('likes', buildTimelineFeatures());
}
export function buildListsFeatures() {
return applyFeatureOverrides('lists', {
rweb_video_screen_enabled: true,
profile_label_improvements_pcf_label_in_post_enabled: true,
responsive_web_profile_redirect_enabled: true,
rweb_tipjar_consumption_enabled: true,
verified_phone_label_enabled: false,
creator_subscriptions_tweet_preview_api_enabled: true,
responsive_web_graphql_timeline_navigation_enabled: true,
responsive_web_graphql_exclude_directive_enabled: true,
responsive_web_graphql_skip_user_profile_image_extensions_enabled: false,
premium_content_api_read_enabled: false,
communities_web_enable_tweet_community_results_fetch: true,
c9s_tweet_anatomy_moderator_badge_enabled: true,
responsive_web_grok_analyze_button_fetch_trends_enabled: false,
responsive_web_grok_analyze_post_followups_enabled: false,
responsive_web_grok_annotations_enabled: false,
responsive_web_jetfuel_frame: true,
post_ctas_fetch_enabled: true,
responsive_web_grok_share_attachment_enabled: true,
articles_preview_enabled: true,
responsive_web_edit_tweet_api_enabled: true,
graphql_is_translatable_rweb_tweet_is_translatable_enabled: true,
view_counts_everywhere_api_enabled: true,
longform_notetweets_consumption_enabled: true,
responsive_web_twitter_article_tweet_consumption_enabled: true,
tweet_awards_web_tipping_enabled: false,
responsive_web_grok_show_grok_translated_post: false,
responsive_web_grok_analysis_button_from_backend: true,
creator_subscriptions_quote_tweet_preview_enabled: false,
freedom_of_speech_not_reach_fetch_enabled: true,
standardized_nudges_misinfo: true,
tweet_with_visibility_results_prefer_gql_limited_actions_policy_enabled: true,
longform_notetweets_rich_text_read_enabled: true,
longform_notetweets_inline_media_enabled: true,
responsive_web_grok_image_annotation_enabled: true,
responsive_web_grok_imagine_annotation_enabled: true,
responsive_web_grok_community_note_auto_translation_is_enabled: false,
responsive_web_enhance_cards_enabled: false,
blue_business_profile_image_shape_enabled: false,
responsive_web_text_conversations_enabled: false,
tweetypie_unmention_optimization_enabled: true,
vibe_api_enabled: false,
interactive_text_enabled: false,
});
}
export function buildHomeTimelineFeatures() {
return applyFeatureOverrides('homeTimeline', {
...buildTimelineFeatures(),
});
}
export function buildUserTweetsFeatures() {
return applyFeatureOverrides('userTweets', {
rweb_video_screen_enabled: false,
profile_label_improvements_pcf_label_in_post_enabled: true,
responsive_web_profile_redirect_enabled: false,
rweb_tipjar_consumption_enabled: true,
verified_phone_label_enabled: false,
creator_subscriptions_tweet_preview_api_enabled: true,
responsive_web_graphql_timeline_navigation_enabled: true,
responsive_web_graphql_skip_user_profile_image_extensions_enabled: false,
premium_content_api_read_enabled: false,
communities_web_enable_tweet_community_results_fetch: true,
c9s_tweet_anatomy_moderator_badge_enabled: true,
responsive_web_grok_analyze_button_fetch_trends_enabled: false,
responsive_web_grok_analyze_post_followups_enabled: true,
responsive_web_jetfuel_frame: true,
post_ctas_fetch_enabled: true,
responsive_web_grok_share_attachment_enabled: true,
responsive_web_grok_annotations_enabled: false,
articles_preview_enabled: true,
responsive_web_edit_tweet_api_enabled: true,
graphql_is_translatable_rweb_tweet_is_translatable_enabled: true,
view_counts_everywhere_api_enabled: true,
longform_notetweets_consumption_enabled: true,
responsive_web_twitter_article_tweet_consumption_enabled: true,
tweet_awards_web_tipping_enabled: false,
responsive_web_grok_show_grok_translated_post: true,
responsive_web_grok_analysis_button_from_backend: true,
creator_subscriptions_quote_tweet_preview_enabled: false,
freedom_of_speech_not_reach_fetch_enabled: true,
standardized_nudges_misinfo: true,
tweet_with_visibility_results_prefer_gql_limited_actions_policy_enabled: true,
longform_notetweets_rich_text_read_enabled: true,
longform_notetweets_inline_media_enabled: true,
responsive_web_grok_image_annotation_enabled: true,
responsive_web_grok_imagine_annotation_enabled: true,
responsive_web_grok_community_note_auto_translation_is_enabled: false,
responsive_web_enhance_cards_enabled: false,
});
}
export function buildFollowingFeatures() {
return applyFeatureOverrides('following', {
rweb_video_screen_enabled: true,
profile_label_improvements_pcf_label_in_post_enabled: false,
responsive_web_profile_redirect_enabled: true,
rweb_tipjar_consumption_enabled: true,
verified_phone_label_enabled: false,
creator_subscriptions_tweet_preview_api_enabled: true,
responsive_web_graphql_timeline_navigation_enabled: true,
responsive_web_graphql_skip_user_profile_image_extensions_enabled: false,
premium_content_api_read_enabled: true,
communities_web_enable_tweet_community_results_fetch: true,
c9s_tweet_anatomy_moderator_badge_enabled: true,
responsive_web_grok_analyze_button_fetch_trends_enabled: false,
responsive_web_grok_analyze_post_followups_enabled: false,
responsive_web_grok_annotations_enabled: false,
responsive_web_jetfuel_frame: false,
post_ctas_fetch_enabled: true,
responsive_web_grok_share_attachment_enabled: false,
articles_preview_enabled: true,
responsive_web_edit_tweet_api_enabled: true,
graphql_is_translatable_rweb_tweet_is_translatable_enabled: true,
view_counts_everywhere_api_enabled: true,
longform_notetweets_consumption_enabled: true,
responsive_web_twitter_article_tweet_consumption_enabled: true,
tweet_awards_web_tipping_enabled: true,
responsive_web_grok_show_grok_translated_post: false,
responsive_web_grok_analysis_button_from_backend: false,
creator_subscriptions_quote_tweet_preview_enabled: false,
freedom_of_speech_not_reach_fetch_enabled: true,
standardized_nudges_misinfo: true,
tweet_with_visibility_results_prefer_gql_limited_actions_policy_enabled: true,
longform_notetweets_rich_text_read_enabled: true,
longform_notetweets_inline_media_enabled: true,
responsive_web_grok_image_annotation_enabled: false,
responsive_web_grok_imagine_annotation_enabled: false,
responsive_web_grok_community_note_auto_translation_is_enabled: false,
responsive_web_enhance_cards_enabled: false,
});
}
export function buildExploreFeatures() {
return applyFeatureOverrides('explore', {
rweb_video_screen_enabled: true,
profile_label_improvements_pcf_label_in_post_enabled: true,
responsive_web_profile_redirect_enabled: true,
rweb_tipjar_consumption_enabled: true,
verified_phone_label_enabled: false,
creator_subscriptions_tweet_preview_api_enabled: true,
responsive_web_graphql_timeline_navigation_enabled: true,
responsive_web_graphql_exclude_directive_enabled: true,
responsive_web_graphql_skip_user_profile_image_extensions_enabled: false,
premium_content_api_read_enabled: false,
communities_web_enable_tweet_community_results_fetch: true,
c9s_tweet_anatomy_moderator_badge_enabled: true,
responsive_web_grok_analyze_button_fetch_trends_enabled: true,
responsive_web_grok_analyze_post_followups_enabled: true,
responsive_web_grok_annotations_enabled: true,
responsive_web_jetfuel_frame: true,
responsive_web_grok_share_attachment_enabled: true,
articles_preview_enabled: true,
responsive_web_edit_tweet_api_enabled: true,
graphql_is_translatable_rweb_tweet_is_translatable_enabled: true,
view_counts_everywhere_api_enabled: true,
longform_notetweets_consumption_enabled: true,
responsive_web_twitter_article_tweet_consumption_enabled: true,
tweet_awards_web_tipping_enabled: false,
responsive_web_grok_show_grok_translated_post: true,
responsive_web_grok_analysis_button_from_backend: true,
creator_subscriptions_quote_tweet_preview_enabled: false,
freedom_of_speech_not_reach_fetch_enabled: true,
standardized_nudges_misinfo: true,
tweet_with_visibility_results_prefer_gql_limited_actions_policy_enabled: true,
longform_notetweets_rich_text_read_enabled: true,
longform_notetweets_inline_media_enabled: true,
responsive_web_grok_image_annotation_enabled: true,
responsive_web_grok_imagine_annotation_enabled: true,
responsive_web_grok_community_note_auto_translation_is_enabled: true,
responsive_web_enhance_cards_enabled: false,
// Additional features required for ExploreSidebar
post_ctas_fetch_enabled: true,
rweb_video_timestamps_enabled: true,
});
}
//# sourceMappingURL=twitter-client-features.js.map


@@ -0,0 +1,157 @@
import { TWITTER_API_BASE } from './twitter-client-constants.js';
import { buildSearchFeatures } from './twitter-client-features.js';
import { extractCursorFromInstructions, parseTweetsFromInstructions } from './twitter-client-utils.js';
const RAW_QUERY_MISSING_REGEX = /must be defined/i;
function isQueryIdMismatch(payload) {
try {
const parsed = JSON.parse(payload);
return (parsed.errors?.some((error) => {
if (error?.extensions?.code === 'GRAPHQL_VALIDATION_FAILED') {
return true;
}
if (error?.path?.includes('rawQuery') && RAW_QUERY_MISSING_REGEX.test(error.message ?? '')) {
return true;
}
return false;
}) ?? false);
}
catch {
return false;
}
}
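The mismatch check above treats a 400/422 body whose errors carry `GRAPHQL_VALIDATION_FAILED` (or a `rawQuery` path error saying the variable "must be defined") as a signal that the cached query ID is stale. A self-contained sketch of the same check against a hypothetical error payload:

```javascript
// Stale-query-ID detection: parse the error body and look for the two signatures.
const RAW_QUERY_MISSING_REGEX = /must be defined/i;
function isQueryIdMismatch(payload) {
  try {
    const parsed = JSON.parse(payload);
    return (parsed.errors?.some((error) => {
      if (error?.extensions?.code === 'GRAPHQL_VALIDATION_FAILED') {
        return true;
      }
      return Boolean(error?.path?.includes('rawQuery') && RAW_QUERY_MISSING_REGEX.test(error.message ?? ''));
    }) ?? false);
  } catch {
    // Non-JSON bodies (HTML error pages, truncated responses) are not a mismatch signal.
    return false;
  }
}
const stale = isQueryIdMismatch(JSON.stringify({ errors: [{ extensions: { code: 'GRAPHQL_VALIDATION_FAILED' } }] }));
```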
export function withSearch(Base) {
class TwitterClientSearch extends Base {
// biome-ignore lint/complexity/noUselessConstructor lint/suspicious/noExplicitAny: TS mixin constructor requirement.
constructor(...args) {
super(...args);
}
/**
* Search for tweets matching a query
*/
async search(query, count = 20, options = {}) {
return this.searchPaged(query, count, options);
}
/**
* Get all search results (paged)
*/
async getAllSearchResults(query, options) {
return this.searchPaged(query, Number.POSITIVE_INFINITY, options);
}
async searchPaged(query, limit, options = {}) {
const features = buildSearchFeatures();
const pageSize = 20;
const seen = new Set();
const tweets = [];
let cursor = options.cursor;
let nextCursor;
let pagesFetched = 0;
const { includeRaw = false, maxPages } = options;
const fetchPage = async (pageCount, pageCursor) => {
let lastError;
let had404 = false;
const queryIds = await this.getSearchTimelineQueryIds();
for (const queryId of queryIds) {
const variables = {
rawQuery: query,
count: pageCount,
querySource: 'typed_query',
product: 'Latest',
...(pageCursor ? { cursor: pageCursor } : {}),
};
const params = new URLSearchParams({
variables: JSON.stringify(variables),
});
const url = `${TWITTER_API_BASE}/${queryId}/SearchTimeline?${params.toString()}`;
try {
const response = await this.fetchWithTimeout(url, {
method: 'POST',
headers: this.getHeaders(),
body: JSON.stringify({ features, queryId }),
});
if (response.status === 404) {
had404 = true;
lastError = `HTTP ${response.status}`;
continue;
}
if (!response.ok) {
const text = await response.text();
const shouldRefreshQueryIds = (response.status === 400 || response.status === 422) && isQueryIdMismatch(text);
return {
success: false,
error: `HTTP ${response.status}: ${text.slice(0, 200)}`,
had404: had404 || shouldRefreshQueryIds,
};
}
const data = (await response.json());
if (data.errors && data.errors.length > 0) {
const shouldRefreshQueryIds = data.errors.some((error) => error?.extensions?.code === 'GRAPHQL_VALIDATION_FAILED');
return {
success: false,
error: data.errors.map((e) => e.message).join(', '),
had404: had404 || shouldRefreshQueryIds,
};
}
const instructions = data.data?.search_by_raw_query?.search_timeline?.timeline?.instructions;
const pageTweets = parseTweetsFromInstructions(instructions, { quoteDepth: this.quoteDepth, includeRaw });
const pageNextCursor = extractCursorFromInstructions(instructions);
return { success: true, tweets: pageTweets, cursor: pageNextCursor, had404 };
}
catch (error) {
lastError = error instanceof Error ? error.message : String(error);
}
}
return { success: false, error: lastError ?? 'Unknown error fetching search results', had404 };
};
const fetchWithRefresh = async (pageCount, pageCursor) => {
const firstAttempt = await fetchPage(pageCount, pageCursor);
if (firstAttempt.success) {
return firstAttempt;
}
if (firstAttempt.had404) {
await this.refreshQueryIds();
const secondAttempt = await fetchPage(pageCount, pageCursor);
if (secondAttempt.success) {
return secondAttempt;
}
return { success: false, error: secondAttempt.error };
}
return { success: false, error: firstAttempt.error };
};
const unlimited = limit === Number.POSITIVE_INFINITY;
while (unlimited || tweets.length < limit) {
const pageCount = unlimited ? pageSize : Math.min(pageSize, limit - tweets.length);
const page = await fetchWithRefresh(pageCount, cursor);
if (!page.success) {
return { success: false, error: page.error };
}
pagesFetched += 1;
let added = 0;
for (const tweet of page.tweets) {
if (seen.has(tweet.id)) {
continue;
}
seen.add(tweet.id);
tweets.push(tweet);
added += 1;
if (!unlimited && tweets.length >= limit) {
break;
}
}
const pageCursor = page.cursor;
if (!pageCursor || pageCursor === cursor || page.tweets.length === 0 || added === 0) {
nextCursor = undefined;
break;
}
if (maxPages && pagesFetched >= maxPages) {
nextCursor = pageCursor;
break;
}
cursor = pageCursor;
nextCursor = pageCursor;
}
return { success: true, tweets, nextCursor };
}
}
return TwitterClientSearch;
}
//# sourceMappingURL=twitter-client-search.js.map


@@ -0,0 +1,2 @@
export {};
//# sourceMappingURL=twitter-client-types.js.map


@@ -0,0 +1,511 @@
export function normalizeQuoteDepth(value) {
if (value === undefined || value === null) {
return 1;
}
if (!Number.isFinite(value)) {
return 1;
}
return Math.max(0, Math.floor(value));
}
export function firstText(...values) {
for (const value of values) {
if (typeof value === 'string') {
const trimmed = value.trim();
if (trimmed) {
return trimmed;
}
}
}
return undefined;
}
export function collectTextFields(value, keys, output) {
if (!value) {
return;
}
if (typeof value === 'string') {
return;
}
if (Array.isArray(value)) {
for (const item of value) {
collectTextFields(item, keys, output);
}
return;
}
if (typeof value === 'object') {
for (const [key, nested] of Object.entries(value)) {
if (keys.has(key)) {
if (typeof nested === 'string') {
const trimmed = nested.trim();
if (trimmed) {
output.push(trimmed);
}
continue;
}
}
collectTextFields(nested, keys, output);
}
}
}
export function uniqueOrdered(values) {
const seen = new Set();
const result = [];
for (const value of values) {
if (seen.has(value)) {
continue;
}
seen.add(value);
result.push(value);
}
return result;
}
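`uniqueOrdered` keeps the first occurrence of each value and drops later duplicates, preserving input order. A runnable usage sketch:

```javascript
// First-occurrence dedup: later duplicates are dropped, order is preserved.
function uniqueOrdered(values) {
  const seen = new Set();
  const result = [];
  for (const value of values) {
    if (!seen.has(value)) {
      seen.add(value);
      result.push(value);
    }
  }
  return result;
}
const deduped = uniqueOrdered(['a', 'b', 'a', 'c', 'b']);
// deduped is ['a', 'b', 'c']
```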
/**
* Renders a Draft.js content_state into readable markdown/text format.
* Handles blocks (paragraphs, headers, lists) and entities (code blocks, links, tweets, dividers).
*/
export function renderContentState(contentState) {
if (!contentState?.blocks || contentState.blocks.length === 0) {
return undefined;
}
// Build entity lookup map from array/object formats
const entityMap = new Map();
const rawEntityMap = contentState.entityMap ?? [];
if (Array.isArray(rawEntityMap)) {
for (const entry of rawEntityMap) {
const key = Number.parseInt(entry.key, 10);
if (!Number.isNaN(key)) {
entityMap.set(key, entry.value);
}
}
}
else {
for (const [key, value] of Object.entries(rawEntityMap)) {
const keyNumber = Number.parseInt(key, 10);
if (!Number.isNaN(keyNumber)) {
entityMap.set(keyNumber, value);
}
}
}
const outputLines = [];
let orderedListCounter = 0;
let previousBlockType;
for (const block of contentState.blocks) {
// Reset ordered list counter when leaving ordered list context
if (block.type !== 'ordered-list-item' && previousBlockType === 'ordered-list-item') {
orderedListCounter = 0;
}
switch (block.type) {
case 'unstyled': {
// Plain paragraph - just output text with any inline formatting
const text = renderBlockText(block, entityMap);
if (text) {
outputLines.push(text);
}
break;
}
case 'header-one': {
const text = renderBlockText(block, entityMap);
if (text) {
outputLines.push(`# ${text}`);
}
break;
}
case 'header-two': {
const text = renderBlockText(block, entityMap);
if (text) {
outputLines.push(`## ${text}`);
}
break;
}
case 'header-three': {
const text = renderBlockText(block, entityMap);
if (text) {
outputLines.push(`### ${text}`);
}
break;
}
case 'unordered-list-item': {
const text = renderBlockText(block, entityMap);
if (text) {
outputLines.push(`- ${text}`);
}
break;
}
case 'ordered-list-item': {
orderedListCounter++;
const text = renderBlockText(block, entityMap);
if (text) {
outputLines.push(`${orderedListCounter}. ${text}`);
}
break;
}
case 'blockquote': {
const text = renderBlockText(block, entityMap);
if (text) {
outputLines.push(`> ${text}`);
}
break;
}
case 'atomic': {
// Atomic blocks are placeholders for embedded entities
const entityContent = renderAtomicBlock(block, entityMap);
if (entityContent) {
outputLines.push(entityContent);
}
break;
}
default: {
// Fallback: just output the text
const text = renderBlockText(block, entityMap);
if (text) {
outputLines.push(text);
}
}
}
previousBlockType = block.type;
}
const result = outputLines.join('\n\n');
return result.trim() || undefined;
}
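The entity-map normalization at the top of `renderContentState` exists because Draft.js payloads ship `entityMap` in two shapes: an array of `{ key, value }` entries or a plain object keyed by stringified numbers. A self-contained sketch of collapsing both shapes into one `Map`, using a hypothetical LINK entity:

```javascript
// Normalize either entityMap shape (array of entries or plain object) into a Map keyed by number.
function normalizeEntityMap(rawEntityMap) {
  const entityMap = new Map();
  if (Array.isArray(rawEntityMap)) {
    for (const entry of rawEntityMap) {
      const key = Number.parseInt(entry.key, 10);
      if (!Number.isNaN(key)) {
        entityMap.set(key, entry.value);
      }
    }
  } else {
    for (const [key, value] of Object.entries(rawEntityMap ?? {})) {
      const keyNumber = Number.parseInt(key, 10);
      if (!Number.isNaN(keyNumber)) {
        entityMap.set(keyNumber, value);
      }
    }
  }
  return entityMap;
}
const fromArray = normalizeEntityMap([{ key: '0', value: { type: 'LINK' } }]);
const fromObject = normalizeEntityMap({ 0: { type: 'LINK' } });
// Both yield a Map with the entity at numeric key 0.
```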
/**
* Renders text content of a block, applying inline link entities.
*/
function renderBlockText(block, entityMap) {
let text = block.text;
// Handle LINK entities by appending the URL in markdown format.
// Process ranges in reverse offset order so earlier offsets stay valid after splicing.
const linkRanges = (block.entityRanges ?? [])
.filter((range) => {
const entity = entityMap.get(range.key);
return entity?.type === 'LINK' && entity.data.url;
})
.sort((a, b) => b.offset - a.offset);
for (const range of linkRanges) {
const entity = entityMap.get(range.key);
if (entity?.data.url) {
const linkText = text.slice(range.offset, range.offset + range.length);
const markdownLink = `[${linkText}](${entity.data.url})`;
text = text.slice(0, range.offset) + markdownLink + text.slice(range.offset + range.length);
}
}
return text.trim();
}
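`renderBlockText` sorts link ranges by descending offset before splicing because each markdown substitution changes the string's length; editing from the end keeps the not-yet-processed earlier offsets valid. A hypothetical single-block demonstration:

```javascript
// Splice markdown links from the end of the string so earlier offsets stay correct.
const text = 'see docs here';
const ranges = [
  { offset: 4, length: 4, url: 'https://example.com/docs' },
  { offset: 9, length: 4, url: 'https://example.com/here' },
];
let out = text;
for (const range of [...ranges].sort((a, b) => b.offset - a.offset)) {
  const linkText = out.slice(range.offset, range.offset + range.length);
  out = out.slice(0, range.offset) + `[${linkText}](${range.url})` + out.slice(range.offset + range.length);
}
// out is 'see [docs](https://example.com/docs) [here](https://example.com/here)'
```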
/**
* Renders an atomic block by looking up its entity and returning appropriate content.
*/
function renderAtomicBlock(block, entityMap) {
const entityRanges = block.entityRanges ?? [];
if (entityRanges.length === 0) {
return undefined;
}
const entityKey = entityRanges[0].key;
const entity = entityMap.get(entityKey);
if (!entity) {
return undefined;
}
switch (entity.type) {
case 'MARKDOWN':
// Code blocks and other markdown content - output as-is
return entity.data.markdown?.trim();
case 'DIVIDER':
return '---';
case 'TWEET':
if (entity.data.tweetId) {
return `[Embedded Tweet: https://x.com/i/status/${entity.data.tweetId}]`;
}
return undefined;
case 'LINK':
if (entity.data.url) {
return `[Link: ${entity.data.url}]`;
}
return undefined;
case 'IMAGE':
// Images in atomic blocks - could extract URL if available
return '[Image]';
default:
return undefined;
}
}
export function extractArticleText(result) {
const article = result?.article;
if (!article) {
return undefined;
}
const articleResult = article.article_results?.result ?? article;
if (process.env.BIRD_DEBUG_ARTICLE === '1') {
console.error('[bird][debug][article] payload:', JSON.stringify({
rest_id: result?.rest_id,
article: articleResult,
note_tweet: result?.note_tweet?.note_tweet_results?.result ?? null,
}, null, 2));
}
const title = firstText(articleResult.title, article.title);
// Try to render from rich content_state first (Draft.js format with blocks + entityMap)
// This preserves code blocks, embedded tweets, markdown, etc.
const contentState = article.article_results?.result?.content_state;
const richBody = renderContentState(contentState);
if (richBody) {
// Rich content found - prepend title if not already included
if (title) {
const normalizedTitle = title.trim();
const trimmedBody = richBody.trimStart();
const headingMatches = [`# ${normalizedTitle}`, `## ${normalizedTitle}`, `### ${normalizedTitle}`];
const hasTitle = trimmedBody === normalizedTitle ||
trimmedBody.startsWith(`${normalizedTitle}\n`) ||
headingMatches.some((heading) => trimmedBody.startsWith(heading));
if (!hasTitle) {
return `${title}\n\n${richBody}`;
}
}
return richBody;
}
// Fallback to plain text extraction for articles without rich content_state
let body = firstText(articleResult.plain_text, article.plain_text, articleResult.body?.text, articleResult.body?.richtext?.text, articleResult.body?.rich_text?.text, articleResult.content?.text, articleResult.content?.richtext?.text, articleResult.content?.rich_text?.text, articleResult.text, articleResult.richtext?.text, articleResult.rich_text?.text, article.body?.text, article.body?.richtext?.text, article.body?.rich_text?.text, article.content?.text, article.content?.richtext?.text, article.content?.rich_text?.text, article.text, article.richtext?.text, article.rich_text?.text);
if (body && title && body.trim() === title.trim()) {
body = undefined;
}
if (!body) {
const collected = [];
collectTextFields(articleResult, new Set(['text', 'title']), collected);
collectTextFields(article, new Set(['text', 'title']), collected);
const unique = uniqueOrdered(collected);
const filtered = title ? unique.filter((value) => value !== title) : unique;
if (filtered.length > 0) {
body = filtered.join('\n\n');
}
}
if (title && body && !body.startsWith(title)) {
return `${title}\n\n${body}`;
}
return body ?? title;
}
export function extractNoteTweetText(result) {
const note = result?.note_tweet?.note_tweet_results?.result;
if (!note) {
return undefined;
}
return firstText(note.text, note.richtext?.text, note.rich_text?.text, note.content?.text, note.content?.richtext?.text, note.content?.rich_text?.text);
}
export function extractTweetText(result) {
return extractArticleText(result) ?? extractNoteTweetText(result) ?? firstText(result?.legacy?.full_text);
}
export function extractArticleMetadata(result) {
const article = result?.article;
if (!article) {
return undefined;
}
const articleResult = article.article_results?.result ?? article;
const title = firstText(articleResult.title, article.title);
if (!title) {
return undefined;
}
// preview_text is available in home timeline responses
const previewText = firstText(articleResult.preview_text, article.preview_text);
return { title, previewText };
}
export function extractMedia(result) {
// Prefer extended_entities (has video info), fall back to entities
const rawMedia = result?.legacy?.extended_entities?.media ?? result?.legacy?.entities?.media;
if (!rawMedia || rawMedia.length === 0) {
return undefined;
}
const media = [];
for (const item of rawMedia) {
if (!item.type || !item.media_url_https) {
continue;
}
const mediaItem = {
type: item.type,
url: item.media_url_https,
};
// Get dimensions from largest available size
const sizes = item.sizes;
if (sizes?.large) {
mediaItem.width = sizes.large.w;
mediaItem.height = sizes.large.h;
}
else if (sizes?.medium) {
mediaItem.width = sizes.medium.w;
mediaItem.height = sizes.medium.h;
}
// For thumbnails/previews
if (sizes?.small) {
mediaItem.previewUrl = `${item.media_url_https}:small`;
}
// Extract video URL for video/animated_gif
if ((item.type === 'video' || item.type === 'animated_gif') && item.video_info?.variants) {
// Prefer highest bitrate MP4, fall back to first MP4 when bitrate is missing.
const mp4Variants = item.video_info.variants.filter((v) => v.content_type === 'video/mp4' && typeof v.url === 'string');
const mp4WithBitrate = mp4Variants
.filter((v) => typeof v.bitrate === 'number')
.sort((a, b) => b.bitrate - a.bitrate);
const selectedVariant = mp4WithBitrate[0] ?? mp4Variants[0];
if (selectedVariant) {
mediaItem.videoUrl = selectedVariant.url;
}
if (typeof item.video_info.duration_millis === 'number') {
mediaItem.durationMs = item.video_info.duration_millis;
}
}
media.push(mediaItem);
}
return media.length > 0 ? media : undefined;
}
export function unwrapTweetResult(result) {
if (!result) {
return undefined;
}
if (result.tweet) {
return result.tweet;
}
return result;
}
export function mapTweetResult(result, quoteDepthOrOptions) {
const options = typeof quoteDepthOrOptions === 'number' ? { quoteDepth: quoteDepthOrOptions } : quoteDepthOrOptions;
const { quoteDepth, includeRaw = false } = options;
const userResult = result?.core?.user_results?.result;
const userLegacy = userResult?.legacy;
const userCore = userResult?.core;
const username = userLegacy?.screen_name ?? userCore?.screen_name;
const name = userLegacy?.name ?? userCore?.name ?? username;
const userId = userResult?.rest_id;
if (!result?.rest_id || !username) {
return undefined;
}
const text = extractTweetText(result);
if (!text) {
return undefined;
}
let quotedTweet;
if (quoteDepth > 0) {
const quotedResult = unwrapTweetResult(result.quoted_status_result?.result);
if (quotedResult) {
quotedTweet = mapTweetResult(quotedResult, { quoteDepth: quoteDepth - 1, includeRaw });
}
}
const media = extractMedia(result);
const article = extractArticleMetadata(result);
const tweetData = {
id: result.rest_id,
text,
createdAt: result.legacy?.created_at,
replyCount: result.legacy?.reply_count,
retweetCount: result.legacy?.retweet_count,
likeCount: result.legacy?.favorite_count,
conversationId: result.legacy?.conversation_id_str,
inReplyToStatusId: result.legacy?.in_reply_to_status_id_str ?? undefined,
author: {
username,
name: name || username,
},
authorId: userId,
quotedTweet,
media,
article,
};
if (includeRaw) {
tweetData._raw = result;
}
return tweetData;
}
export function findTweetInInstructions(instructions, tweetId) {
if (!instructions) {
return undefined;
}
for (const instruction of instructions) {
for (const entry of instruction.entries || []) {
const result = entry.content?.itemContent?.tweet_results?.result;
if (result?.rest_id === tweetId) {
return result;
}
}
}
return undefined;
}
export function collectTweetResultsFromEntry(entry) {
const results = [];
const pushResult = (result) => {
if (result?.rest_id) {
results.push(result);
}
};
const content = entry.content;
pushResult(content?.itemContent?.tweet_results?.result);
pushResult(content?.item?.itemContent?.tweet_results?.result);
for (const item of content?.items ?? []) {
pushResult(item?.item?.itemContent?.tweet_results?.result);
pushResult(item?.itemContent?.tweet_results?.result);
pushResult(item?.content?.itemContent?.tweet_results?.result);
}
return results;
}
export function parseTweetsFromInstructions(instructions, quoteDepthOrOptions) {
const options = typeof quoteDepthOrOptions === 'number' ? { quoteDepth: quoteDepthOrOptions } : quoteDepthOrOptions;
const { quoteDepth, includeRaw = false } = options;
const tweets = [];
const seen = new Set();
for (const instruction of instructions ?? []) {
for (const entry of instruction.entries ?? []) {
const results = collectTweetResultsFromEntry(entry);
for (const result of results) {
const mapped = mapTweetResult(result, { quoteDepth, includeRaw });
if (!mapped || seen.has(mapped.id)) {
continue;
}
seen.add(mapped.id);
tweets.push(mapped);
}
}
}
return tweets;
}
export function extractCursorFromInstructions(instructions, cursorType = 'Bottom') {
for (const instruction of instructions ?? []) {
for (const entry of instruction.entries ?? []) {
const content = entry.content;
if (content?.cursorType === cursorType && typeof content.value === 'string' && content.value.length > 0) {
return content.value;
}
}
}
return undefined;
}
export function parseUsersFromInstructions(instructions) {
if (!instructions) {
return [];
}
const users = [];
for (const instruction of instructions) {
if (!instruction.entries) {
continue;
}
for (const entry of instruction.entries) {
const content = entry?.content;
const rawUserResult = content?.itemContent?.user_results?.result;
const userResult = rawUserResult?.__typename === 'UserWithVisibilityResults' && rawUserResult.user
? rawUserResult.user
: rawUserResult;
if (!userResult || userResult.__typename !== 'User') {
continue;
}
const legacy = userResult.legacy;
const core = userResult.core;
const username = legacy?.screen_name ?? core?.screen_name;
if (!userResult.rest_id || !username) {
continue;
}
users.push({
id: userResult.rest_id,
username,
name: legacy?.name ?? core?.name ?? username,
description: legacy?.description,
followersCount: legacy?.followers_count,
followingCount: legacy?.friends_count,
isBlueVerified: userResult.is_blue_verified,
profileImageUrl: legacy?.profile_image_url_https ?? userResult.avatar?.image_url,
createdAt: legacy?.created_at ?? core?.created_at,
});
}
}
return users;
}
//# sourceMappingURL=twitter-client-utils.js.map
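The parsers above all walk the same GraphQL timeline shape: `instructions -> entries -> entry.content`. A minimal standalone sketch (not the module's API) of the cursor walk, run against a fabricated payload for illustration:

```javascript
// Standalone sketch of the nesting extractCursorFromInstructions traverses.
// The payload below is fabricated for illustration only.
function extractCursor(instructions, cursorType = 'Bottom') {
  for (const instruction of instructions ?? []) {
    for (const entry of instruction.entries ?? []) {
      const content = entry.content;
      if (content?.cursorType === cursorType && typeof content.value === 'string' && content.value.length > 0) {
        return content.value;
      }
    }
  }
  return undefined;
}

const instructions = [
  {
    entries: [
      // A tweet entry: cursor extraction skips it because it has no cursorType.
      { content: { itemContent: { tweet_results: { result: { rest_id: '1' } } } } },
      // A pagination cursor entry at the bottom of the timeline.
      { content: { cursorType: 'Bottom', value: 'cursor-abc' } },
    ],
  },
];
console.log(extractCursor(instructions)); // 'cursor-abc'
```

Requesting a cursor type that is absent (e.g. `'Top'` here) falls through both loops and returns `undefined`, which callers treat as "no more pages".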


@@ -0,0 +1,22 @@
MIT License

Copyright (c) 2025 Peter Steinberger

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


@@ -0,0 +1,29 @@
# @steipete/sweet-cookie

Inline-first browser cookie extraction for local tooling (no native addons).

Supports:

- Inline payloads (JSON / base64 / file) — most reliable path.
- Local browser reads (best effort): Chrome, Edge, Firefox, Safari (macOS).

Install:

```bash
npm i @steipete/sweet-cookie
```

Usage:

```ts
import { getCookies, toCookieHeader } from '@steipete/sweet-cookie';

const { cookies, warnings } = await getCookies({
  url: 'https://example.com/',
  names: ['session', 'csrf'],
  browsers: ['chrome', 'edge', 'firefox', 'safari'],
});

for (const w of warnings) console.warn(w);

const cookieHeader = toCookieHeader(cookies, { dedupeByName: true });
```

Docs + extension exporter: see the repo root README.
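To make the `dedupeByName` option concrete, here is a hypothetical re-implementation of the header-building step. This sketch assumes first-occurrence-wins precedence; the real `toCookieHeader` may order or prioritize differently:

```javascript
// Hypothetical sketch of Cookie-header assembly, NOT the package's actual
// implementation. Assumes the first cookie seen for a name wins.
function toCookieHeaderSketch(cookies, { dedupeByName = false } = {}) {
  const seen = new Set();
  const parts = [];
  for (const c of cookies) {
    if (dedupeByName) {
      if (seen.has(c.name)) continue; // drop later duplicates of the same name
      seen.add(c.name);
    }
    parts.push(`${c.name}=${c.value}`);
  }
  return parts.join('; ');
}

console.log(toCookieHeaderSketch(
  [
    { name: 'session', value: 'a' },
    { name: 'session', value: 'b' }, // duplicate name, dropped when deduping
    { name: 'csrf', value: 'x' },
  ],
  { dedupeByName: true },
)); // 'session=a; csrf=x'
```

Deduping matters when the same cookie name is read from multiple browsers or profiles: without it the header would carry conflicting values for one name.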


@@ -0,0 +1,3 @@
export { getCookies, toCookieHeader } from './public.js';
export type { BrowserName, Cookie, CookieHeaderOptions, CookieMode, CookieSameSite, GetCookiesOptions, GetCookiesResult, } from './types.js';
//# sourceMappingURL=index.d.ts.map


@@ -0,0 +1 @@
{"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,UAAU,EAAE,cAAc,EAAE,MAAM,aAAa,CAAC;AACzD,YAAY,EACX,WAAW,EACX,MAAM,EACN,mBAAmB,EACnB,UAAU,EACV,cAAc,EACd,iBAAiB,EACjB,gBAAgB,GAChB,MAAM,YAAY,CAAC"}


@@ -0,0 +1,2 @@
export { getCookies, toCookieHeader } from './public.js';
//# sourceMappingURL=index.js.map


@@ -0,0 +1 @@
{"version":3,"file":"index.js","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,UAAU,EAAE,cAAc,EAAE,MAAM,aAAa,CAAC"}


@@ -0,0 +1,8 @@
import type { GetCookiesResult } from '../types.js';
export declare function getCookiesFromChrome(options: {
profile?: string;
timeoutMs?: number;
includeExpired?: boolean;
debug?: boolean;
}, origins: string[], allowlistNames: Set<string> | null): Promise<GetCookiesResult>;
//# sourceMappingURL=chrome.d.ts.map


@@ -0,0 +1 @@
{"version":3,"file":"chrome.d.ts","sourceRoot":"","sources":["../../src/providers/chrome.ts"],"names":[],"mappings":"AAAA,OAAO,KAAK,EAAU,gBAAgB,EAAE,MAAM,aAAa,CAAC;AAK5D,wBAAsB,oBAAoB,CACzC,OAAO,EAAE;IAAE,OAAO,CAAC,EAAE,MAAM,CAAC;IAAC,SAAS,CAAC,EAAE,MAAM,CAAC;IAAC,cAAc,CAAC,EAAE,OAAO,CAAC;IAAC,KAAK,CAAC,EAAE,OAAO,CAAA;CAAE,EAC5F,OAAO,EAAE,MAAM,EAAE,EACjB,cAAc,EAAE,GAAG,CAAC,MAAM,CAAC,GAAG,IAAI,GAChC,OAAO,CAAC,gBAAgB,CAAC,CA0B3B"}


@@ -0,0 +1,27 @@
import { getCookiesFromChromeSqliteLinux } from './chromeSqliteLinux.js';
import { getCookiesFromChromeSqliteMac } from './chromeSqliteMac.js';
import { getCookiesFromChromeSqliteWindows } from './chromeSqliteWindows.js';
export async function getCookiesFromChrome(options, origins, allowlistNames) {
const warnings = [];
// Platform dispatch only. All real logic lives in the per-OS providers.
if (process.platform === 'darwin') {
const r = await getCookiesFromChromeSqliteMac(options, origins, allowlistNames);
warnings.push(...r.warnings);
const cookies = r.cookies;
return { cookies, warnings };
}
if (process.platform === 'linux') {
const r = await getCookiesFromChromeSqliteLinux(options, origins, allowlistNames);
warnings.push(...r.warnings);
const cookies = r.cookies;
return { cookies, warnings };
}
if (process.platform === 'win32') {
const r = await getCookiesFromChromeSqliteWindows(options, origins, allowlistNames);
warnings.push(...r.warnings);
const cookies = r.cookies;
return { cookies, warnings };
}
return { cookies: [], warnings };
}
//# sourceMappingURL=chrome.js.map


@@ -0,0 +1 @@
{"version":3,"file":"chrome.js","sourceRoot":"","sources":["../../src/providers/chrome.ts"],"names":[],"mappings":"AACA,OAAO,EAAE,+BAA+B,EAAE,MAAM,wBAAwB,CAAC;AACzE,OAAO,EAAE,6BAA6B,EAAE,MAAM,sBAAsB,CAAC;AACrE,OAAO,EAAE,iCAAiC,EAAE,MAAM,0BAA0B,CAAC;AAE7E,MAAM,CAAC,KAAK,UAAU,oBAAoB,CACzC,OAA4F,EAC5F,OAAiB,EACjB,cAAkC;IAElC,MAAM,QAAQ,GAAa,EAAE,CAAC;IAE9B,wEAAwE;IACxE,IAAI,OAAO,CAAC,QAAQ,KAAK,QAAQ,EAAE,CAAC;QACnC,MAAM,CAAC,GAAG,MAAM,6BAA6B,CAAC,OAAO,EAAE,OAAO,EAAE,cAAc,CAAC,CAAC;QAChF,QAAQ,CAAC,IAAI,CAAC,GAAG,CAAC,CAAC,QAAQ,CAAC,CAAC;QAC7B,MAAM,OAAO,GAAa,CAAC,CAAC,OAAO,CAAC;QACpC,OAAO,EAAE,OAAO,EAAE,QAAQ,EAAE,CAAC;IAC9B,CAAC;IAED,IAAI,OAAO,CAAC,QAAQ,KAAK,OAAO,EAAE,CAAC;QAClC,MAAM,CAAC,GAAG,MAAM,+BAA+B,CAAC,OAAO,EAAE,OAAO,EAAE,cAAc,CAAC,CAAC;QAClF,QAAQ,CAAC,IAAI,CAAC,GAAG,CAAC,CAAC,QAAQ,CAAC,CAAC;QAC7B,MAAM,OAAO,GAAa,CAAC,CAAC,OAAO,CAAC;QACpC,OAAO,EAAE,OAAO,EAAE,QAAQ,EAAE,CAAC;IAC9B,CAAC;IAED,IAAI,OAAO,CAAC,QAAQ,KAAK,OAAO,EAAE,CAAC;QAClC,MAAM,CAAC,GAAG,MAAM,iCAAiC,CAAC,OAAO,EAAE,OAAO,EAAE,cAAc,CAAC,CAAC;QACpF,QAAQ,CAAC,IAAI,CAAC,GAAG,CAAC,CAAC,QAAQ,CAAC,CAAC;QAC7B,MAAM,OAAO,GAAa,CAAC,CAAC,OAAO,CAAC;QACpC,OAAO,EAAE,OAAO,EAAE,QAAQ,EAAE,CAAC;IAC9B,CAAC;IAED,OAAO,EAAE,OAAO,EAAE,EAAE,EAAE,QAAQ,EAAE,CAAC;AAClC,CAAC"}


@@ -0,0 +1,11 @@
export declare function deriveAes128CbcKeyFromPassword(password: string, options: {
iterations: number;
}): Buffer;
export declare function decryptChromiumAes128CbcCookieValue(encryptedValue: Uint8Array, keyCandidates: readonly Buffer[], options: {
stripHashPrefix: boolean;
treatUnknownPrefixAsPlaintext?: boolean;
}): string | null;
export declare function decryptChromiumAes256GcmCookieValue(encryptedValue: Uint8Array, key: Buffer, options: {
stripHashPrefix: boolean;
}): string | null;
//# sourceMappingURL=crypto.d.ts.map


@@ -0,0 +1 @@
{"version":3,"file":"crypto.d.ts","sourceRoot":"","sources":["../../../src/providers/chromeSqlite/crypto.ts"],"names":[],"mappings":"AAIA,wBAAgB,8BAA8B,CAC7C,QAAQ,EAAE,MAAM,EAChB,OAAO,EAAE;IAAE,UAAU,EAAE,MAAM,CAAA;CAAE,GAC7B,MAAM,CAIR;AAED,wBAAgB,mCAAmC,CAClD,cAAc,EAAE,UAAU,EAC1B,aAAa,EAAE,SAAS,MAAM,EAAE,EAChC,OAAO,EAAE;IAAE,eAAe,EAAE,OAAO,CAAC;IAAC,6BAA6B,CAAC,EAAE,OAAO,CAAA;CAAE,GAC5E,MAAM,GAAG,IAAI,CA2Bf;AAED,wBAAgB,mCAAmC,CAClD,cAAc,EAAE,UAAU,EAC1B,GAAG,EAAE,MAAM,EACX,OAAO,EAAE;IAAE,eAAe,EAAE,OAAO,CAAA;CAAE,GACnC,MAAM,GAAG,IAAI,CAyBf"}


@@ -0,0 +1,100 @@
import { createDecipheriv, pbkdf2Sync } from 'node:crypto';
const UTF8_DECODER = new TextDecoder('utf-8', { fatal: true });
export function deriveAes128CbcKeyFromPassword(password, options) {
// Chromium derives the AES-128-CBC key from "Chrome Safe Storage" using PBKDF2.
// The salt/length/digest are fixed by Chromium ("saltysalt", 16 bytes, sha1).
return pbkdf2Sync(password, 'saltysalt', options.iterations, 16, 'sha1');
}
export function decryptChromiumAes128CbcCookieValue(encryptedValue, keyCandidates, options) {
const buf = Buffer.from(encryptedValue);
if (buf.length < 3)
return null;
// Chromium prefixes encrypted cookies with `v10`, `v11`, ... (three bytes).
const prefix = buf.subarray(0, 3).toString('utf8');
const hasVersionPrefix = /^v\d\d$/.test(prefix);
if (!hasVersionPrefix) {
// Some platforms (notably macOS) can store plaintext values in `encrypted_value`.
// Callers decide whether unknown prefixes should be treated as plaintext.
if (options.treatUnknownPrefixAsPlaintext === false)
return null;
return decodeCookieValueBytes(buf, false);
}
const ciphertext = buf.subarray(3);
if (!ciphertext.length)
return '';
for (const key of keyCandidates) {
// Try multiple candidates because Linux may fall back to empty passwords depending on keyring state.
const decrypted = tryDecryptAes128Cbc(ciphertext, key);
if (!decrypted)
continue;
const decoded = decodeCookieValueBytes(decrypted, options.stripHashPrefix);
if (decoded !== null)
return decoded;
}
return null;
}
export function decryptChromiumAes256GcmCookieValue(encryptedValue, key, options) {
const buf = Buffer.from(encryptedValue);
if (buf.length < 3)
return null;
const prefix = buf.subarray(0, 3).toString('utf8');
if (!/^v\d\d$/.test(prefix))
return null;
// AES-256-GCM layout:
// - 12-byte nonce
// - ciphertext
// - 16-byte authentication tag
const payload = buf.subarray(3);
if (payload.length < 12 + 16)
return null;
const nonce = payload.subarray(0, 12);
const authenticationTag = payload.subarray(payload.length - 16);
const ciphertext = payload.subarray(12, payload.length - 16);
try {
const decipher = createDecipheriv('aes-256-gcm', key, nonce);
decipher.setAuthTag(authenticationTag);
const plaintext = Buffer.concat([decipher.update(ciphertext), decipher.final()]);
return decodeCookieValueBytes(plaintext, options.stripHashPrefix);
}
catch {
return null;
}
}
function tryDecryptAes128Cbc(ciphertext, key) {
try {
// Chromium's legacy AES-128-CBC uses an IV of 16 spaces.
const iv = Buffer.alloc(16, 0x20);
const decipher = createDecipheriv('aes-128-cbc', key, iv);
decipher.setAutoPadding(false);
const plaintext = Buffer.concat([decipher.update(ciphertext), decipher.final()]);
return removePkcs7Padding(plaintext);
}
catch {
return null;
}
}
function removePkcs7Padding(value) {
if (!value.length)
return value;
const padding = value[value.length - 1];
if (!padding || padding > 16)
return value;
return value.subarray(0, value.length - padding);
}
function decodeCookieValueBytes(value, stripHashPrefix) {
// Chromium >= 24 prepends a 32-byte hash to cookie values.
const bytes = stripHashPrefix && value.length >= 32 ? value.subarray(32) : value;
try {
return stripLeadingControlChars(UTF8_DECODER.decode(bytes));
}
catch {
return null;
}
}
function stripLeadingControlChars(value) {
let i = 0;
while (i < value.length && value.charCodeAt(i) < 0x20)
i += 1;
return value.slice(i);
}
//# sourceMappingURL=crypto.js.map
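The `v10` AES-256-GCM layout the decryptor parses (3-byte version prefix, 12-byte nonce, ciphertext, 16-byte auth tag) can be exercised end-to-end with `node:crypto`. The key and plaintext below are fabricated; this only demonstrates the byte layout, not real Chromium key material:

```javascript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

// Build a blob in the same layout decryptChromiumAes256GcmCookieValue expects:
// 'v10' prefix + 12-byte nonce + ciphertext + 16-byte GCM auth tag.
const key = randomBytes(32); // fabricated key, not a real Chromium key
const nonce = randomBytes(12);
const cipher = createCipheriv('aes-256-gcm', key, nonce);
const ciphertext = Buffer.concat([cipher.update('cookie-value', 'utf8'), cipher.final()]);
const tag = cipher.getAuthTag();
const blob = Buffer.concat([Buffer.from('v10'), nonce, ciphertext, tag]);

// Parse it back the way the decryptor does: strip prefix, split nonce/ct/tag.
const payload = blob.subarray(3);
const parsedNonce = payload.subarray(0, 12);
const parsedTag = payload.subarray(payload.length - 16);
const parsedCt = payload.subarray(12, payload.length - 16);
const decipher = createDecipheriv('aes-256-gcm', key, parsedNonce);
decipher.setAuthTag(parsedTag);
const plaintext = Buffer.concat([decipher.update(parsedCt), decipher.final()]).toString('utf8');
console.log(plaintext); // 'cookie-value'
```

Because GCM authenticates the ciphertext, a wrong key or a truncated tag makes `decipher.final()` throw — which is why the real decryptor wraps this in `try`/`catch` and returns `null` instead of propagating the error.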


@@ -0,0 +1 @@
{"version":3,"file":"crypto.js","sourceRoot":"","sources":["../../../src/providers/chromeSqlite/crypto.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,gBAAgB,EAAE,UAAU,EAAE,MAAM,aAAa,CAAC;AAE3D,MAAM,YAAY,GAAG,IAAI,WAAW,CAAC,OAAO,EAAE,EAAE,KAAK,EAAE,IAAI,EAAE,CAAC,CAAC;AAE/D,MAAM,UAAU,8BAA8B,CAC7C,QAAgB,EAChB,OAA+B;IAE/B,gFAAgF;IAChF,8EAA8E;IAC9E,OAAO,UAAU,CAAC,QAAQ,EAAE,WAAW,EAAE,OAAO,CAAC,UAAU,EAAE,EAAE,EAAE,MAAM,CAAC,CAAC;AAC1E,CAAC;AAED,MAAM,UAAU,mCAAmC,CAClD,cAA0B,EAC1B,aAAgC,EAChC,OAA8E;IAE9E,MAAM,GAAG,GAAG,MAAM,CAAC,IAAI,CAAC,cAAc,CAAC,CAAC;IACxC,IAAI,GAAG,CAAC,MAAM,GAAG,CAAC;QAAE,OAAO,IAAI,CAAC;IAEhC,4EAA4E;IAC5E,MAAM,MAAM,GAAG,GAAG,CAAC,QAAQ,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC,QAAQ,CAAC,MAAM,CAAC,CAAC;IACnD,MAAM,gBAAgB,GAAG,SAAS,CAAC,IAAI,CAAC,MAAM,CAAC,CAAC;IAEhD,IAAI,CAAC,gBAAgB,EAAE,CAAC;QACvB,kFAAkF;QAClF,0EAA0E;QAC1E,IAAI,OAAO,CAAC,6BAA6B,KAAK,KAAK;YAAE,OAAO,IAAI,CAAC;QACjE,OAAO,sBAAsB,CAAC,GAAG,EAAE,KAAK,CAAC,CAAC;IAC3C,CAAC;IAED,MAAM,UAAU,GAAG,GAAG,CAAC,QAAQ,CAAC,CAAC,CAAC,CAAC;IACnC,IAAI,CAAC,UAAU,CAAC,MAAM;QAAE,OAAO,EAAE,CAAC;IAElC,KAAK,MAAM,GAAG,IAAI,aAAa,EAAE,CAAC;QACjC,qGAAqG;QACrG,MAAM,SAAS,GAAG,mBAAmB,CAAC,UAAU,EAAE,GAAG,CAAC,CAAC;QACvD,IAAI,CAAC,SAAS;YAAE,SAAS;QACzB,MAAM,OAAO,GAAG,sBAAsB,CAAC,SAAS,EAAE,OAAO,CAAC,eAAe,CAAC,CAAC;QAC3E,IAAI,OAAO,KAAK,IAAI;YAAE,OAAO,OAAO,CAAC;IACtC,CAAC;IAED,OAAO,IAAI,CAAC;AACb,CAAC;AAED,MAAM,UAAU,mCAAmC,CAClD,cAA0B,EAC1B,GAAW,EACX,OAAqC;IAErC,MAAM,GAAG,GAAG,MAAM,CAAC,IAAI,CAAC,cAAc,CAAC,CAAC;IACxC,IAAI,GAAG,CAAC,MAAM,GAAG,CAAC;QAAE,OAAO,IAAI,CAAC;IAChC,MAAM,MAAM,GAAG,GAAG,CAAC,QAAQ,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC,QAAQ,CAAC,MAAM,CAAC,CAAC;IACnD,IAAI,CAAC,SAAS,CAAC,IAAI,CAAC,MAAM,CAAC;QAAE,OAAO,IAAI,CAAC;IAEzC,sBAAsB;IACtB,kBAAkB;IAClB,eAAe;IACf,+BAA+B;IAC/B,MAAM,OAAO,GAAG,GAAG,CAAC,QAAQ,CAAC,CAAC,CAAC,CAAC;IAChC,IAAI,OAAO,CAAC,MAAM,GAAG,EAAE,GAAG,EAAE;QAAE,OAAO,IAAI,CAAC;IAE1C,MAAM,KAAK,GAAG,OAAO,CAAC,QAAQ,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC;IACtC,MAAM,iBAAiB,GAAG,OAAO,CAAC,QAAQ,CAAC,OAAO,CAAC,MAAM,GAAG,EAAE,CAAC,CAAC;IAChE,MAAM,UAA
U,GAAG,OAAO,CAAC,QAAQ,CAAC,EAAE,EAAE,OAAO,CAAC,MAAM,GAAG,EAAE,CAAC,CAAC;IAE7D,IAAI,CAAC;QACJ,MAAM,QAAQ,GAAG,gBAAgB,CAAC,aAAa,EAAE,GAAG,EAAE,KAAK,CAAC,CAAC;QAC7D,QAAQ,CAAC,UAAU,CAAC,iBAAiB,CAAC,CAAC;QACvC,MAAM,SAAS,GAAG,MAAM,CAAC,MAAM,CAAC,CAAC,QAAQ,CAAC,MAAM,CAAC,UAAU,CAAC,EAAE,QAAQ,CAAC,KAAK,EAAE,CAAC,CAAC,CAAC;QACjF,OAAO,sBAAsB,CAAC,SAAS,EAAE,OAAO,CAAC,eAAe,CAAC,CAAC;IACnE,CAAC;IAAC,MAAM,CAAC;QACR,OAAO,IAAI,CAAC;IACb,CAAC;AACF,CAAC;AAED,SAAS,mBAAmB,CAAC,UAAkB,EAAE,GAAW;IAC3D,IAAI,CAAC;QACJ,yDAAyD;QACzD,MAAM,EAAE,GAAG,MAAM,CAAC,KAAK,CAAC,EAAE,EAAE,IAAI,CAAC,CAAC;QAClC,MAAM,QAAQ,GAAG,gBAAgB,CAAC,aAAa,EAAE,GAAG,EAAE,EAAE,CAAC,CAAC;QAC1D,QAAQ,CAAC,cAAc,CAAC,KAAK,CAAC,CAAC;QAC/B,MAAM,SAAS,GAAG,MAAM,CAAC,MAAM,CAAC,CAAC,QAAQ,CAAC,MAAM,CAAC,UAAU,CAAC,EAAE,QAAQ,CAAC,KAAK,EAAE,CAAC,CAAC,CAAC;QACjF,OAAO,kBAAkB,CAAC,SAAS,CAAC,CAAC;IACtC,CAAC;IAAC,MAAM,CAAC;QACR,OAAO,IAAI,CAAC;IACb,CAAC;AACF,CAAC;AAED,SAAS,kBAAkB,CAAC,KAAa;IACxC,IAAI,CAAC,KAAK,CAAC,MAAM;QAAE,OAAO,KAAK,CAAC;IAChC,MAAM,OAAO,GAAG,KAAK,CAAC,KAAK,CAAC,MAAM,GAAG,CAAC,CAAC,CAAC;IACxC,IAAI,CAAC,OAAO,IAAI,OAAO,GAAG,EAAE;QAAE,OAAO,KAAK,CAAC;IAC3C,OAAO,KAAK,CAAC,QAAQ,CAAC,CAAC,EAAE,KAAK,CAAC,MAAM,GAAG,OAAO,CAAC,CAAC;AAClD,CAAC;AAED,SAAS,sBAAsB,CAAC,KAAa,EAAE,eAAwB;IACtE,2DAA2D;IAC3D,MAAM,KAAK,GAAG,eAAe,IAAI,KAAK,CAAC,MAAM,IAAI,EAAE,CAAC,CAAC,CAAC,KAAK,CAAC,QAAQ,CAAC,EAAE,CAAC,CAAC,CAAC,CAAC,KAAK,CAAC;IACjF,IAAI,CAAC;QACJ,OAAO,wBAAwB,CAAC,YAAY,CAAC,MAAM,CAAC,KAAK,CAAC,CAAC,CAAC;IAC7D,CAAC;IAAC,MAAM,CAAC;QACR,OAAO,IAAI,CAAC;IACb,CAAC;AACF,CAAC;AAED,SAAS,wBAAwB,CAAC,KAAa;IAC9C,IAAI,CAAC,GAAG,CAAC,CAAC;IACV,OAAO,CAAC,GAAG,KAAK,CAAC,MAAM,IAAI,KAAK,CAAC,UAAU,CAAC,CAAC,CAAC,GAAG,IAAI;QAAE,CAAC,IAAI,CAAC,CAAC;IAC9D,OAAO,KAAK,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC;AACvB,CAAC"}


@@ -0,0 +1,25 @@
export type LinuxKeyringBackend = 'gnome' | 'kwallet' | 'basic';
/**
* Read the "Safe Storage" password from a Linux keyring.
*
* Chromium browsers typically store their cookie encryption password under:
* - service: "<Browser> Safe Storage"
* - account: "<Browser>"
*
* We keep this logic in JS (no native deps) and return an empty password on failure
* (Chromium may still have v10 cookies, and callers can use inline/export escape hatches).
*/
export declare function getLinuxChromiumSafeStoragePassword(options: {
backend?: LinuxKeyringBackend;
app: 'chrome' | 'edge';
}): Promise<{
password: string;
warnings: string[];
}>;
export declare function getLinuxChromeSafeStoragePassword(options?: {
backend?: LinuxKeyringBackend;
}): Promise<{
password: string;
warnings: string[];
}>;
//# sourceMappingURL=linuxKeyring.d.ts.map


@@ -0,0 +1 @@
{"version":3,"file":"linuxKeyring.d.ts","sourceRoot":"","sources":["../../../src/providers/chromeSqlite/linuxKeyring.ts"],"names":[],"mappings":"AAEA,MAAM,MAAM,mBAAmB,GAAG,OAAO,GAAG,SAAS,GAAG,OAAO,CAAC;AAEhE;;;;;;;;;GASG;AACH,wBAAsB,mCAAmC,CAAC,OAAO,EAAE;IAClE,OAAO,CAAC,EAAE,mBAAmB,CAAC;IAC9B,GAAG,EAAE,QAAQ,GAAG,MAAM,CAAC;CACvB,GAAG,OAAO,CAAC;IAAE,QAAQ,EAAE,MAAM,CAAC;IAAC,QAAQ,EAAE,MAAM,EAAE,CAAA;CAAE,CAAC,CA8DpD;AAED,wBAAsB,iCAAiC,CACtD,OAAO,GAAE;IAAE,OAAO,CAAC,EAAE,mBAAmB,CAAA;CAAO,GAC7C,OAAO,CAAC;IAAE,QAAQ,EAAE,MAAM,CAAC;IAAC,QAAQ,EAAE,MAAM,EAAE,CAAA;CAAE,CAAC,CAInD"}


@@ -0,0 +1,104 @@
import { execCapture } from '../../util/exec.js';
/**
* Read the "Safe Storage" password from a Linux keyring.
*
* Chromium browsers typically store their cookie encryption password under:
* - service: "<Browser> Safe Storage"
* - account: "<Browser>"
*
* We keep this logic in JS (no native deps) and return an empty password on failure
* (Chromium may still have v10 cookies, and callers can use inline/export escape hatches).
*/
export async function getLinuxChromiumSafeStoragePassword(options) {
const warnings = [];
// Escape hatch: if callers already know the password (or want deterministic CI behavior),
// they can bypass keyring probing entirely.
const overrideKey = options.app === 'edge'
? 'SWEET_COOKIE_EDGE_SAFE_STORAGE_PASSWORD'
: 'SWEET_COOKIE_CHROME_SAFE_STORAGE_PASSWORD';
const override = readEnv(overrideKey);
if (override !== undefined)
return { password: override, warnings };
const backend = options.backend ?? parseLinuxKeyringBackend() ?? chooseLinuxKeyringBackend();
// `basic` means "don't try keyrings" (Chrome will fall back to older/less-secure schemes on some setups).
if (backend === 'basic')
return { password: '', warnings };
const service = options.app === 'edge' ? 'Microsoft Edge Safe Storage' : 'Chrome Safe Storage';
const account = options.app === 'edge' ? 'Microsoft Edge' : 'Chrome';
const folder = `${account} Keys`;
if (backend === 'gnome') {
// GNOME keyring: `secret-tool` is the simplest way to read libsecret entries.
const res = await execCapture('secret-tool', ['lookup', 'service', service, 'account', account], { timeoutMs: 3_000 });
if (res.code === 0)
return { password: res.stdout.trim(), warnings };
warnings.push('Failed to read Linux keyring via secret-tool; v11 cookies may be unavailable.');
return { password: '', warnings };
}
// KDE keyring: query KWallet via `kwallet-query`, but the wallet name differs across KDE versions.
const kdeVersion = (readEnv('KDE_SESSION_VERSION') ?? '').trim();
const serviceName = kdeVersion === '6'
? 'org.kde.kwalletd6'
: kdeVersion === '5'
? 'org.kde.kwalletd5'
: 'org.kde.kwalletd';
const walletPath = kdeVersion === '6'
? '/modules/kwalletd6'
: kdeVersion === '5'
? '/modules/kwalletd5'
: '/modules/kwalletd';
const wallet = await getKWalletNetworkWallet(serviceName, walletPath);
const passwordRes = await execCapture('kwallet-query', ['--read-password', service, '--folder', folder, wallet], { timeoutMs: 3_000 });
if (passwordRes.code !== 0) {
warnings.push('Failed to read Linux keyring via kwallet-query; v11 cookies may be unavailable.');
return { password: '', warnings };
}
if (passwordRes.stdout.toLowerCase().startsWith('failed to read'))
return { password: '', warnings };
return { password: passwordRes.stdout.trim(), warnings };
}
export async function getLinuxChromeSafeStoragePassword(options = {}) {
const args = { app: 'chrome' };
if (options.backend !== undefined)
args.backend = options.backend;
return await getLinuxChromiumSafeStoragePassword(args);
}
function parseLinuxKeyringBackend() {
const raw = readEnv('SWEET_COOKIE_LINUX_KEYRING');
if (!raw)
return undefined;
const normalized = raw.toLowerCase();
if (normalized === 'gnome')
return 'gnome';
if (normalized === 'kwallet')
return 'kwallet';
if (normalized === 'basic')
return 'basic';
return undefined;
}
function chooseLinuxKeyringBackend() {
const xdg = readEnv('XDG_CURRENT_DESKTOP') ?? '';
const isKde = xdg.split(':').some((p) => p.trim().toLowerCase() === 'kde') || !!readEnv('KDE_FULL_SESSION');
return isKde ? 'kwallet' : 'gnome';
}
async function getKWalletNetworkWallet(serviceName, walletPath) {
const res = await execCapture('dbus-send', [
'--session',
'--print-reply=literal',
`--dest=${serviceName}`,
walletPath,
'org.kde.KWallet.networkWallet',
], { timeoutMs: 3_000 });
const fallback = 'kdewallet';
if (res.code !== 0)
return fallback;
const raw = res.stdout.trim();
if (!raw)
return fallback;
return raw.replaceAll('"', '').trim() || fallback;
}
function readEnv(key) {
const value = process.env[key];
const trimmed = typeof value === 'string' ? value.trim() : '';
return trimmed.length ? trimmed : undefined;
}
//# sourceMappingURL=linuxKeyring.js.map
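The backend auto-detection above hinges on `XDG_CURRENT_DESKTOP` being a colon-separated list of desktop names. A standalone sketch of that heuristic, decoupled from `process.env` so it can be exercised directly:

```javascript
// Sketch of chooseLinuxKeyringBackend's heuristic: any "kde" entry in the
// colon-separated XDG_CURRENT_DESKTOP list (or a set KDE_FULL_SESSION)
// selects KWallet; everything else falls back to the GNOME keyring.
function chooseBackend(xdgCurrentDesktop, kdeFullSession) {
  const isKde =
    (xdgCurrentDesktop ?? '').split(':').some((p) => p.trim().toLowerCase() === 'kde') ||
    !!kdeFullSession;
  return isKde ? 'kwallet' : 'gnome';
}

console.log(chooseBackend('ubuntu:GNOME', '')); // 'gnome'
console.log(chooseBackend('X-Cinnamon:KDE', '')); // 'kwallet'
console.log(chooseBackend('', 'true')); // 'kwallet'
```

Matching on list entries rather than substrings avoids false positives from values that merely contain "kde", and the `KDE_FULL_SESSION` fallback covers older KDE sessions that do not export `XDG_CURRENT_DESKTOP`.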


@@ -0,0 +1 @@
{"version":3,"file":"linuxKeyring.js","sourceRoot":"","sources":["../../../src/providers/chromeSqlite/linuxKeyring.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,WAAW,EAAE,MAAM,oBAAoB,CAAC;AAIjD;;;;;;;;;GASG;AACH,MAAM,CAAC,KAAK,UAAU,mCAAmC,CAAC,OAGzD;IACA,MAAM,QAAQ,GAAa,EAAE,CAAC;IAE9B,0FAA0F;IAC1F,4CAA4C;IAC5C,MAAM,WAAW,GAChB,OAAO,CAAC,GAAG,KAAK,MAAM;QACrB,CAAC,CAAC,yCAAyC;QAC3C,CAAC,CAAC,2CAA2C,CAAC;IAChD,MAAM,QAAQ,GAAG,OAAO,CAAC,WAAW,CAAC,CAAC;IACtC,IAAI,QAAQ,KAAK,SAAS;QAAE,OAAO,EAAE,QAAQ,EAAE,QAAQ,EAAE,QAAQ,EAAE,CAAC;IAEpE,MAAM,OAAO,GAAG,OAAO,CAAC,OAAO,IAAI,wBAAwB,EAAE,IAAI,yBAAyB,EAAE,CAAC;IAC7F,0GAA0G;IAC1G,IAAI,OAAO,KAAK,OAAO;QAAE,OAAO,EAAE,QAAQ,EAAE,EAAE,EAAE,QAAQ,EAAE,CAAC;IAE3D,MAAM,OAAO,GAAG,OAAO,CAAC,GAAG,KAAK,MAAM,CAAC,CAAC,CAAC,6BAA6B,CAAC,CAAC,CAAC,qBAAqB,CAAC;IAC/F,MAAM,OAAO,GAAG,OAAO,CAAC,GAAG,KAAK,MAAM,CAAC,CAAC,CAAC,gBAAgB,CAAC,CAAC,CAAC,QAAQ,CAAC;IACrE,MAAM,MAAM,GAAG,GAAG,OAAO,OAAO,CAAC;IAEjC,IAAI,OAAO,KAAK,OAAO,EAAE,CAAC;QACzB,8EAA8E;QAC9E,MAAM,GAAG,GAAG,MAAM,WAAW,CAC5B,aAAa,EACb,CAAC,QAAQ,EAAE,SAAS,EAAE,OAAO,EAAE,SAAS,EAAE,OAAO,CAAC,EAClD,EAAE,SAAS,EAAE,KAAK,EAAE,CACpB,CAAC;QACF,IAAI,GAAG,CAAC,IAAI,KAAK,CAAC;YAAE,OAAO,EAAE,QAAQ,EAAE,GAAG,CAAC,MAAM,CAAC,IAAI,EAAE,EAAE,QAAQ,EAAE,CAAC;QACrE,QAAQ,CAAC,IAAI,CAAC,+EAA+E,CAAC,CAAC;QAC/F,OAAO,EAAE,QAAQ,EAAE,EAAE,EAAE,QAAQ,EAAE,CAAC;IACnC,CAAC;IAED,mGAAmG;IACnG,MAAM,UAAU,GAAG,CAAC,OAAO,CAAC,qBAAqB,CAAC,IAAI,EAAE,CAAC,CAAC,IAAI,EAAE,CAAC;IACjE,MAAM,WAAW,GAChB,UAAU,KAAK,GAAG;QACjB,CAAC,CAAC,mBAAmB;QACrB,CAAC,CAAC,UAAU,KAAK,GAAG;YACnB,CAAC,CAAC,mBAAmB;YACrB,CAAC,CAAC,kBAAkB,CAAC;IACxB,MAAM,UAAU,GACf,UAAU,KAAK,GAAG;QACjB,CAAC,CAAC,oBAAoB;QACtB,CAAC,CAAC,UAAU,KAAK,GAAG;YACnB,CAAC,CAAC,oBAAoB;YACtB,CAAC,CAAC,mBAAmB,CAAC;IAEzB,MAAM,MAAM,GAAG,MAAM,uBAAuB,CAAC,WAAW,EAAE,UAAU,CAAC,CAAC;IACtE,MAAM,WAAW,GAAG,MAAM,WAAW,CACpC,eAAe,EACf,CAAC,iBAAiB,EAAE,OAAO,EAAE,UAAU,EAAE,MAAM,EAAE,MAAM,CAAC,EACxD,EAAE,SAAS,EAAE,KAAK,EAAE,CACpB,CAAC;IACF,IAAI,WAAW,CAAC,IAAI,KAAK,CAAC,EAAE,CAAC;QAC5B,QAAQ,CAAC,IAAI,CACZ,iFAAiF,CACjF,CAAC;
QACF,OAAO,EAAE,QAAQ,EAAE,EAAE,EAAE,QAAQ,EAAE,CAAC;IACnC,CAAC;IACD,IAAI,WAAW,CAAC,MAAM,CAAC,WAAW,EAAE,CAAC,UAAU,CAAC,gBAAgB,CAAC;QAChE,OAAO,EAAE,QAAQ,EAAE,EAAE,EAAE,QAAQ,EAAE,CAAC;IACnC,OAAO,EAAE,QAAQ,EAAE,WAAW,CAAC,MAAM,CAAC,IAAI,EAAE,EAAE,QAAQ,EAAE,CAAC;AAC1D,CAAC;AAED,MAAM,CAAC,KAAK,UAAU,iCAAiC,CACtD,UAA6C,EAAE;IAE/C,MAAM,IAAI,GAAqD,EAAE,GAAG,EAAE,QAAQ,EAAE,CAAC;IACjF,IAAI,OAAO,CAAC,OAAO,KAAK,SAAS;QAAE,IAAI,CAAC,OAAO,GAAG,OAAO,CAAC,OAAO,CAAC;IAClE,OAAO,MAAM,mCAAmC,CAAC,IAAI,CAAC,CAAC;AACxD,CAAC;AAED,SAAS,wBAAwB;IAChC,MAAM,GAAG,GAAG,OAAO,CAAC,4BAA4B,CAAC,CAAC;IAClD,IAAI,CAAC,GAAG;QAAE,OAAO,SAAS,CAAC;IAC3B,MAAM,UAAU,GAAG,GAAG,CAAC,WAAW,EAAE,CAAC;IACrC,IAAI,UAAU,KAAK,OAAO;QAAE,OAAO,OAAO,CAAC;IAC3C,IAAI,UAAU,KAAK,SAAS;QAAE,OAAO,SAAS,CAAC;IAC/C,IAAI,UAAU,KAAK,OAAO;QAAE,OAAO,OAAO,CAAC;IAC3C,OAAO,SAAS,CAAC;AAClB,CAAC;AAED,SAAS,yBAAyB;IACjC,MAAM,GAAG,GAAG,OAAO,CAAC,qBAAqB,CAAC,IAAI,EAAE,CAAC;IACjD,MAAM,KAAK,GACV,GAAG,CAAC,KAAK,CAAC,GAAG,CAAC,CAAC,IAAI,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC,CAAC,IAAI,EAAE,CAAC,WAAW,EAAE,KAAK,KAAK,CAAC,IAAI,CAAC,CAAC,OAAO,CAAC,kBAAkB,CAAC,CAAC;IAC/F,OAAO,KAAK,CAAC,CAAC,CAAC,SAAS,CAAC,CAAC,CAAC,OAAO,CAAC;AACpC,CAAC;AAED,KAAK,UAAU,uBAAuB,CAAC,WAAmB,EAAE,UAAkB;IAC7E,MAAM,GAAG,GAAG,MAAM,WAAW,CAC5B,WAAW,EACX;QACC,WAAW;QACX,uBAAuB;QACvB,UAAU,WAAW,EAAE;QACvB,UAAU;QACV,+BAA+B;KAC/B,EACD,EAAE,SAAS,EAAE,KAAK,EAAE,CACpB,CAAC;IACF,MAAM,QAAQ,GAAG,WAAW,CAAC;IAC7B,IAAI,GAAG,CAAC,IAAI,KAAK,CAAC;QAAE,OAAO,QAAQ,CAAC;IACpC,MAAM,GAAG,GAAG,GAAG,CAAC,MAAM,CAAC,IAAI,EAAE,CAAC;IAC9B,IAAI,CAAC,GAAG;QAAE,OAAO,QAAQ,CAAC;IAC1B,OAAO,GAAG,CAAC,UAAU,CAAC,GAAG,EAAE,EAAE,CAAC,CAAC,IAAI,EAAE,IAAI,QAAQ,CAAC;AACnD,CAAC;AAED,SAAS,OAAO,CAAC,GAAW;IAC3B,MAAM,KAAK,GAAG,OAAO,CAAC,GAAG,CAAC,GAAG,CAAC,CAAC;IAC/B,MAAM,OAAO,GAAG,OAAO,KAAK,KAAK,QAAQ,CAAC,CAAC,CAAC,KAAK,CAAC,IAAI,EAAE,CAAC,CAAC,CAAC,EAAE,CAAC;IAC9D,OAAO,OAAO,CAAC,MAAM,CAAC,CAAC,CAAC,OAAO,CAAC,CAAC,CAAC,SAAS,CAAC;AAC7C,CAAC"}


@@ -0,0 +1,10 @@
import type { GetCookiesResult } from '../../types.js';
export declare function getCookiesFromChromeSqliteDb(options: {
dbPath: string;
profile?: string;
includeExpired?: boolean;
debug?: boolean;
}, origins: string[], allowlistNames: Set<string> | null, decrypt: (encryptedValue: Uint8Array, options: {
stripHashPrefix: boolean;
}) => string | null): Promise<GetCookiesResult>;
//# sourceMappingURL=shared.d.ts.map


@@ -0,0 +1 @@
{"version":3,"file":"shared.d.ts","sourceRoot":"","sources":["../../../src/providers/chromeSqlite/shared.ts"],"names":[],"mappings":"AAIA,OAAO,KAAK,EAA0B,gBAAgB,EAAE,MAAM,gBAAgB,CAAC;AAkB/E,wBAAsB,4BAA4B,CACjD,OAAO,EAAE;IAAE,MAAM,EAAE,MAAM,CAAC;IAAC,OAAO,CAAC,EAAE,MAAM,CAAC;IAAC,cAAc,CAAC,EAAE,OAAO,CAAC;IAAC,KAAK,CAAC,EAAE,OAAO,CAAA;CAAE,EACxF,OAAO,EAAE,MAAM,EAAE,EACjB,cAAc,EAAE,GAAG,CAAC,MAAM,CAAC,GAAG,IAAI,EAClC,OAAO,EAAE,CAAC,cAAc,EAAE,UAAU,EAAE,OAAO,EAAE;IAAE,eAAe,EAAE,OAAO,CAAA;CAAE,KAAK,MAAM,GAAG,IAAI,GAC3F,OAAO,CAAC,gBAAgB,CAAC,CAsD3B"}


@@ -0,0 +1,293 @@
import { copyFileSync, existsSync, mkdtempSync, rmSync } from 'node:fs';
import { tmpdir } from 'node:os';
import path from 'node:path';
import { normalizeExpiration } from '../../util/expire.js';
import { hostMatchesCookieDomain } from '../../util/hostMatch.js';
import { importNodeSqlite, supportsReadBigInts } from '../../util/nodeSqlite.js';
import { isBunRuntime } from '../../util/runtime.js';
export async function getCookiesFromChromeSqliteDb(options, origins, allowlistNames, decrypt) {
const warnings = [];
// Chrome can keep its cookie DB locked and/or rely on WAL sidecars.
// Copying to a temp dir gives us a stable snapshot that both node:sqlite and bun:sqlite can open.
const tempDir = mkdtempSync(path.join(tmpdir(), 'sweet-cookie-chrome-'));
const tempDbPath = path.join(tempDir, 'Cookies');
try {
copyFileSync(options.dbPath, tempDbPath);
// If WAL is enabled, the latest writes might live in `Cookies-wal`/`Cookies-shm`.
// Copy them too when present so our snapshot reflects the current browser state.
copySidecar(options.dbPath, `${tempDbPath}-wal`, '-wal');
copySidecar(options.dbPath, `${tempDbPath}-shm`, '-shm');
}
catch (error) {
rmSync(tempDir, { recursive: true, force: true });
warnings.push(`Failed to copy Chrome cookie DB: ${error instanceof Error ? error.message : String(error)}`);
return { cookies: [], warnings };
}
try {
const hosts = origins.map((o) => new URL(o).hostname);
const where = buildHostWhereClause(hosts, 'host_key');
const metaVersion = await readChromiumMetaVersion(tempDbPath);
// Chromium >= 24 stores a 32-byte hash prefix in decrypted cookie values.
// We detect this via the `meta` table version and strip it when present.
const stripHashPrefix = metaVersion >= 24;
const rowsResult = await readChromeRows(tempDbPath, where);
if (!rowsResult.ok) {
warnings.push(rowsResult.error);
return { cookies: [], warnings };
}
const collectOptions = {};
if (options.profile)
collectOptions.profile = options.profile;
if (options.includeExpired !== undefined)
collectOptions.includeExpired = options.includeExpired;
const cookies = collectChromeCookiesFromRows(rowsResult.rows, collectOptions, hosts, allowlistNames, (encryptedValue) => decrypt(encryptedValue, { stripHashPrefix }), warnings);
return { cookies: dedupeCookies(cookies), warnings };
}
finally {
rmSync(tempDir, { recursive: true, force: true });
}
}
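The snapshot step above can be exercised on its own. This is a hypothetical stand-alone sketch (names are not from the library): copy the DB plus any `-wal`/`-shm` sidecars into a temp dir so SQLite opens a consistent state even while the browser holds the live files. The demo runs against stand-in files rather than a real Chrome profile.

```javascript
import { copyFileSync, existsSync, mkdtempSync, rmSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import path from 'node:path';

// Copy the main DB, then any WAL sidecars that happen to exist alongside it.
function snapshotDb(dbPath) {
    const dir = mkdtempSync(path.join(tmpdir(), 'cookie-snap-'));
    const copy = path.join(dir, 'Cookies');
    copyFileSync(dbPath, copy);
    for (const suffix of ['-wal', '-shm']) {
        if (existsSync(dbPath + suffix)) copyFileSync(dbPath + suffix, `${copy}${suffix}`);
    }
    return { dir, copy };
}

// Demo with stand-in files instead of a live browser profile.
const src = mkdtempSync(path.join(tmpdir(), 'cookie-src-'));
writeFileSync(path.join(src, 'Cookies'), 'db');
writeFileSync(path.join(src, 'Cookies-wal'), 'wal');
const snap = snapshotDb(path.join(src, 'Cookies'));
const copiedDb = existsSync(snap.copy);
const copiedWal = existsSync(`${snap.copy}-wal`);
console.log(copiedDb, copiedWal); // true true
rmSync(src, { recursive: true, force: true });
rmSync(snap.dir, { recursive: true, force: true });
```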
function collectChromeCookiesFromRows(rows, options, hosts, allowlistNames, decrypt, warnings) {
const cookies = [];
const now = Math.floor(Date.now() / 1000);
let warnedEncryptedType = false;
for (const row of rows) {
const name = typeof row.name === 'string' ? row.name : null;
if (!name)
continue;
if (allowlistNames && allowlistNames.size > 0 && !allowlistNames.has(name))
continue;
const hostKey = typeof row.host_key === 'string' ? row.host_key : null;
if (!hostKey)
continue;
if (!hostMatchesAny(hosts, hostKey))
continue;
const rowPath = typeof row.path === 'string' ? row.path : '';
const valueString = typeof row.value === 'string' ? row.value : null;
let value = valueString;
if (value === null || value.length === 0) {
// Many modern Chromium cookies keep `value` empty and only store `encrypted_value`.
// We decrypt on demand and drop rows we can't interpret.
const encryptedBytes = getEncryptedBytes(row);
if (!encryptedBytes) {
if (!warnedEncryptedType && row.encrypted_value !== undefined) {
warnings.push('Chrome cookie encrypted_value has an unsupported type.');
warnedEncryptedType = true;
}
continue;
}
value = decrypt(encryptedBytes);
}
if (value === null)
continue;
const expiresRaw = typeof row.expires_utc === 'number' || typeof row.expires_utc === 'bigint'
? row.expires_utc
: tryParseInt(row.expires_utc);
const expires = normalizeExpiration(expiresRaw ?? undefined);
if (!options.includeExpired) {
if (expires && expires < now)
continue;
}
const secure = row.is_secure === 1 ||
row.is_secure === 1n ||
row.is_secure === '1' ||
row.is_secure === true;
const httpOnly = row.is_httponly === 1 ||
row.is_httponly === 1n ||
row.is_httponly === '1' ||
row.is_httponly === true;
const sameSite = normalizeChromiumSameSite(row.samesite);
const source = { browser: 'chrome' };
if (options.profile)
source.profile = options.profile;
const cookie = {
name,
value,
domain: hostKey.startsWith('.') ? hostKey.slice(1) : hostKey,
path: rowPath || '/',
secure,
httpOnly,
source,
};
if (expires !== undefined)
cookie.expires = expires;
if (sameSite !== undefined)
cookie.sameSite = sameSite;
cookies.push(cookie);
}
return cookies;
}
function tryParseInt(value) {
if (typeof value === 'bigint') {
const parsed = Number(value);
return Number.isFinite(parsed) ? parsed : null;
}
if (typeof value !== 'string')
return null;
const parsed = Number.parseInt(value, 10);
return Number.isFinite(parsed) ? parsed : null;
}
function normalizeChromiumSameSite(value) {
if (typeof value === 'bigint') {
const parsed = Number(value);
return Number.isFinite(parsed) ? normalizeChromiumSameSite(parsed) : undefined;
}
if (typeof value === 'number') {
if (value === 2)
return 'Strict';
if (value === 1)
return 'Lax';
if (value === 0)
return 'None';
return undefined;
}
if (typeof value === 'string') {
const parsed = Number.parseInt(value, 10);
if (Number.isFinite(parsed))
return normalizeChromiumSameSite(parsed);
const normalized = value.toLowerCase();
if (normalized === 'strict')
return 'Strict';
if (normalized === 'lax')
return 'Lax';
if (normalized === 'none' || normalized === 'no_restriction')
return 'None';
}
return undefined;
}
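For the numeric forms, the mapping above boils down to Chromium's integer encoding (0 = None, 1 = Lax, 2 = Strict). A condensed restatement:

```javascript
// Condensed version of normalizeChromiumSameSite for numeric/string inputs.
function sameSiteFromChromium(value) {
    const n = typeof value === 'string' ? Number.parseInt(value, 10) : Number(value);
    if (n === 2) return 'Strict';
    if (n === 1) return 'Lax';
    if (n === 0) return 'None';
    return undefined; // unknown or non-numeric input
}

console.log(sameSiteFromChromium(2), sameSiteFromChromium('0')); // Strict None
```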
function getEncryptedBytes(row) {
const raw = row.encrypted_value;
if (raw instanceof Uint8Array)
return raw;
return null;
}
async function readChromiumMetaVersion(dbPath) {
const sql = `SELECT value FROM meta WHERE key = 'version'`;
const result = isBunRuntime()
? await queryNodeOrBun({ kind: 'bun', dbPath, sql })
: await queryNodeOrBun({ kind: 'node', dbPath, sql });
if (!result.ok)
return 0;
const first = result.rows[0];
const value = first?.value;
if (typeof value === 'number')
return Math.floor(value);
if (typeof value === 'bigint') {
const parsed = Number(value);
return Number.isFinite(parsed) ? Math.floor(parsed) : 0;
}
if (typeof value === 'string') {
const parsed = Number.parseInt(value, 10);
return Number.isFinite(parsed) ? parsed : 0;
}
return 0;
}
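The meta-version rule that `stripHashPrefix` feeds can be sketched directly. This assumes, per the comments above, that Chromium meta version >= 24 prepends a 32-byte hash to each decrypted cookie value; the helper name here is hypothetical:

```javascript
// Drop the leading 32-byte hash only when the DB's meta version requires it.
function stripHashPrefixIfNeeded(decrypted, metaVersion) {
    return metaVersion >= 24 ? decrypted.subarray(32) : decrypted;
}

const padded = Buffer.concat([Buffer.alloc(32, 0xab), Buffer.from('value')]);
console.log(stripHashPrefixIfNeeded(padded, 24).toString('utf8')); // value
console.log(stripHashPrefixIfNeeded(Buffer.from('value'), 20).toString('utf8')); // value
```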
async function readChromeRows(dbPath, where) {
const sqliteKind = isBunRuntime() ? 'bun' : 'node';
const sqliteLabel = sqliteKind === 'bun' ? 'bun:sqlite' : 'node:sqlite';
const sql = `SELECT name, value, host_key, path, expires_utc, samesite, encrypted_value, ` +
`is_secure AS is_secure, is_httponly AS is_httponly ` +
`FROM cookies WHERE (${where}) ORDER BY expires_utc DESC;`;
const result = await queryNodeOrBun({ kind: sqliteKind, dbPath, sql });
if (result.ok)
return { ok: true, rows: result.rows };
// Intentionally strict: only support modern Chromium cookie DB schemas.
// If this fails, assume the local Chrome/Chromium is too old or uses a non-standard schema.
return {
ok: false,
error: `${sqliteLabel} failed reading Chrome cookies (requires modern Chromium, e.g. Chrome >= 100): ${result.error}`,
};
}
async function queryNodeOrBun(options) {
try {
if (options.kind === 'node') {
// Node's `node:sqlite` is synchronous and returns plain JS values. Keep it boxed in a
// small scope so callers don't need to care about runtime differences.
const { DatabaseSync } = await importNodeSqlite();
const dbOptions = { readOnly: true };
if (supportsReadBigInts()) {
dbOptions.readBigInts = true;
}
const db = new DatabaseSync(options.dbPath, dbOptions);
try {
const rows = db.prepare(options.sql).all();
return { ok: true, rows };
}
finally {
db.close();
}
}
// Bun's sqlite API has a different surface (`Database` + `.query().all()`).
const { Database } = await import('bun:sqlite');
const db = new Database(options.dbPath, { readonly: true });
try {
const rows = db.query(options.sql).all();
return { ok: true, rows };
}
finally {
db.close();
}
}
catch (error) {
return { ok: false, error: error instanceof Error ? error.message : String(error) };
}
}
function copySidecar(sourceDbPath, target, suffix) {
const sidecar = `${sourceDbPath}${suffix}`;
if (!existsSync(sidecar))
return;
try {
copyFileSync(sidecar, target);
}
catch {
// ignore
}
}
function buildHostWhereClause(hosts, column) {
const clauses = [];
for (const host of hosts) {
// Chrome cookies often live on parent domains (e.g. .google.com for gemini.google.com).
// Include parent domains so the SQL filter doesn't drop valid session cookies.
for (const candidate of expandHostCandidates(host)) {
const escaped = sqlLiteral(candidate);
const escapedDot = sqlLiteral(`.${candidate}`);
const escapedLike = sqlLiteral(`%.${candidate}`);
clauses.push(`${column} = ${escaped}`);
clauses.push(`${column} = ${escapedDot}`);
clauses.push(`${column} LIKE ${escapedLike}`);
}
}
return clauses.length ? clauses.join(' OR ') : '1=0';
}
function sqlLiteral(value) {
const escaped = value.replaceAll("'", "''");
return `'${escaped}'`;
}
function expandHostCandidates(host) {
const parts = host.split('.').filter(Boolean);
if (parts.length <= 1)
return [host];
const candidates = new Set();
candidates.add(host);
// Include parent domains down to two labels (avoid TLD-only fragments).
for (let i = 1; i <= parts.length - 2; i += 1) {
const candidate = parts.slice(i).join('.');
if (candidate)
candidates.add(candidate);
}
return Array.from(candidates);
}
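The parent-domain expansion above can be checked in isolation: the host itself plus every parent domain down to two labels, so a query for `gemini.google.com` also matches cookies stored on `.google.com`.

```javascript
// Same logic as expandHostCandidates above, restated standalone.
function expandHostCandidates(host) {
    const parts = host.split('.').filter(Boolean);
    if (parts.length <= 1) return [host];
    const candidates = new Set([host]);
    for (let i = 1; i <= parts.length - 2; i += 1) {
        candidates.add(parts.slice(i).join('.'));
    }
    return Array.from(candidates);
}

console.log(expandHostCandidates('gemini.google.com')); // [ 'gemini.google.com', 'google.com' ]
```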
function hostMatchesAny(hosts, cookieHost) {
const cookieDomain = cookieHost.startsWith('.') ? cookieHost.slice(1) : cookieHost;
return hosts.some((host) => hostMatchesCookieDomain(host, cookieDomain));
}
function dedupeCookies(cookies) {
const merged = new Map();
for (const cookie of cookies) {
const key = `${cookie.name}|${cookie.domain ?? ''}|${cookie.path ?? ''}`;
if (!merged.has(key))
merged.set(key, cookie);
}
return Array.from(merged.values());
}
//# sourceMappingURL=shared.js.map
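The dedupe at the end of the file above is first-occurrence-wins per `(name, domain, path)`; since the SQL orders rows by `expires_utc DESC`, the latest-expiring duplicate survives. A minimal restatement:

```javascript
// Mirrors dedupeCookies: keep the first cookie seen for each composite key.
function dedupe(cookies) {
    const merged = new Map();
    for (const cookie of cookies) {
        const key = `${cookie.name}|${cookie.domain ?? ''}|${cookie.path ?? ''}`;
        if (!merged.has(key)) merged.set(key, cookie);
    }
    return Array.from(merged.values());
}

const result = dedupe([
    { name: 'sid', domain: 'google.com', path: '/', value: 'kept' },
    { name: 'sid', domain: 'google.com', path: '/', value: 'dropped' },
]);
console.log(result.length, result[0].value); // 1 kept
```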

File diff suppressed because one or more lines are too long

View File

@@ -0,0 +1,10 @@
export declare function dpapiUnprotect(data: Buffer, options?: {
timeoutMs?: number;
}): Promise<{
ok: true;
value: Buffer;
} | {
ok: false;
error: string;
}>;
//# sourceMappingURL=windowsDpapi.d.ts.map

View File

@@ -0,0 +1 @@
{"version":3,"file":"windowsDpapi.d.ts","sourceRoot":"","sources":["../../../src/providers/chromeSqlite/windowsDpapi.ts"],"names":[],"mappings":"AAEA,wBAAsB,cAAc,CACnC,IAAI,EAAE,MAAM,EACZ,OAAO,GAAE;IAAE,SAAS,CAAC,EAAE,MAAM,CAAA;CAAO,GAClC,OAAO,CAAC;IAAE,EAAE,EAAE,IAAI,CAAC;IAAC,KAAK,EAAE,MAAM,CAAA;CAAE,GAAG;IAAE,EAAE,EAAE,KAAK,CAAC;IAAC,KAAK,EAAE,MAAM,CAAA;CAAE,CAAC,CA+BrE"}

View File

@@ -0,0 +1,26 @@
import { execCapture } from '../../util/exec.js';
export async function dpapiUnprotect(data, options = {}) {
const timeoutMs = options.timeoutMs ?? 5_000;
// There is no cross-platform JS API for Windows DPAPI, and we explicitly avoid native addons.
// PowerShell can call ProtectedData.Unprotect for the current user, which matches Chrome's behavior.
const inputB64 = data.toString('base64');
const prelude = 'try { Add-Type -AssemblyName System.Security.Cryptography.ProtectedData -ErrorAction Stop } catch { try { Add-Type -AssemblyName System.Security -ErrorAction Stop } catch {} };';
const script = prelude +
`$in=[Convert]::FromBase64String('${inputB64}');` +
`$out=[System.Security.Cryptography.ProtectedData]::Unprotect($in,$null,[System.Security.Cryptography.DataProtectionScope]::CurrentUser);` +
`[Convert]::ToBase64String($out)`;
const res = await execCapture('powershell', ['-NoProfile', '-NonInteractive', '-Command', script], {
timeoutMs,
});
if (res.code !== 0) {
return { ok: false, error: res.stderr.trim() || `powershell exit ${res.code}` };
}
try {
const out = Buffer.from(res.stdout.trim(), 'base64');
return { ok: true, value: out };
}
catch (error) {
return { ok: false, error: error instanceof Error ? error.message : String(error) };
}
}
//# sourceMappingURL=windowsDpapi.js.map
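The string plumbing in `dpapiUnprotect` can be sketched without invoking PowerShell: input bytes travel in as base64, `ProtectedData.Unprotect` runs under `CurrentUser`, and the result comes back as base64 on stdout. The helper name below is hypothetical and only builds the script text.

```javascript
// Build the PowerShell one-liner; no process is spawned here.
function buildDpapiScript(data) {
    const inputB64 = data.toString('base64');
    return (
        `$in=[Convert]::FromBase64String('${inputB64}');` +
        `$out=[System.Security.Cryptography.ProtectedData]::Unprotect($in,$null,[System.Security.Cryptography.DataProtectionScope]::CurrentUser);` +
        `[Convert]::ToBase64String($out)`
    );
}

const script = buildDpapiScript(Buffer.from('secret'));
console.log(script.includes('c2VjcmV0')); // true ('secret' in base64)
```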

View File

@@ -0,0 +1 @@
{"version":3,"file":"windowsDpapi.js","sourceRoot":"","sources":["../../../src/providers/chromeSqlite/windowsDpapi.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,WAAW,EAAE,MAAM,oBAAoB,CAAC;AAEjD,MAAM,CAAC,KAAK,UAAU,cAAc,CACnC,IAAY,EACZ,UAAkC,EAAE;IAEpC,MAAM,SAAS,GAAG,OAAO,CAAC,SAAS,IAAI,KAAK,CAAC;IAE7C,8FAA8F;IAC9F,qGAAqG;IACrG,MAAM,QAAQ,GAAG,IAAI,CAAC,QAAQ,CAAC,QAAQ,CAAC,CAAC;IACzC,MAAM,OAAO,GACZ,kLAAkL,CAAC;IACpL,MAAM,MAAM,GACX,OAAO;QACP,oCAAoC,QAAQ,KAAK;QACjD,0IAA0I;QAC1I,iCAAiC,CAAC;IAEnC,MAAM,GAAG,GAAG,MAAM,WAAW,CAC5B,YAAY,EACZ,CAAC,YAAY,EAAE,iBAAiB,EAAE,UAAU,EAAE,MAAM,CAAC,EACrD;QACC,SAAS;KACT,CACD,CAAC;IACF,IAAI,GAAG,CAAC,IAAI,KAAK,CAAC,EAAE,CAAC;QACpB,OAAO,EAAE,EAAE,EAAE,KAAK,EAAE,KAAK,EAAE,GAAG,CAAC,MAAM,CAAC,IAAI,EAAE,IAAI,mBAAmB,GAAG,CAAC,IAAI,EAAE,EAAE,CAAC;IACjF,CAAC;IAED,IAAI,CAAC;QACJ,MAAM,GAAG,GAAG,MAAM,CAAC,IAAI,CAAC,GAAG,CAAC,MAAM,CAAC,IAAI,EAAE,EAAE,QAAQ,CAAC,CAAC;QACrD,OAAO,EAAE,EAAE,EAAE,IAAI,EAAE,KAAK,EAAE,GAAG,EAAE,CAAC;IACjC,CAAC;IAAC,OAAO,KAAK,EAAE,CAAC;QAChB,OAAO,EAAE,EAAE,EAAE,KAAK,EAAE,KAAK,EAAE,KAAK,YAAY,KAAK,CAAC,CAAC,CAAC,KAAK,CAAC,OAAO,CAAC,CAAC,CAAC,MAAM,CAAC,KAAK,CAAC,EAAE,CAAC;IACrF,CAAC;AACF,CAAC"}

View File

@@ -0,0 +1,7 @@
import type { GetCookiesResult } from '../types.js';
export declare function getCookiesFromChromeSqliteLinux(options: {
profile?: string;
includeExpired?: boolean;
debug?: boolean;
}, origins: string[], allowlistNames: Set<string> | null): Promise<GetCookiesResult>;
//# sourceMappingURL=chromeSqliteLinux.d.ts.map

View File

@@ -0,0 +1 @@
{"version":3,"file":"chromeSqliteLinux.d.ts","sourceRoot":"","sources":["../../src/providers/chromeSqliteLinux.ts"],"names":[],"mappings":"AAAA,OAAO,KAAK,EAAE,gBAAgB,EAAE,MAAM,aAAa,CAAC;AASpD,wBAAsB,+BAA+B,CACpD,OAAO,EAAE;IAAE,OAAO,CAAC,EAAE,MAAM,CAAC;IAAC,cAAc,CAAC,EAAE,OAAO,CAAC;IAAC,KAAK,CAAC,EAAE,OAAO,CAAA;CAAE,EACxE,OAAO,EAAE,MAAM,EAAE,EACjB,cAAc,EAAE,GAAG,CAAC,MAAM,CAAC,GAAG,IAAI,GAChC,OAAO,CAAC,gBAAgB,CAAC,CAkD3B"}

View File

@@ -0,0 +1,51 @@
import { decryptChromiumAes128CbcCookieValue, deriveAes128CbcKeyFromPassword, } from './chromeSqlite/crypto.js';
import { getLinuxChromeSafeStoragePassword } from './chromeSqlite/linuxKeyring.js';
import { getCookiesFromChromeSqliteDb } from './chromeSqlite/shared.js';
import { resolveChromiumCookiesDbLinux } from './chromium/linuxPaths.js';
export async function getCookiesFromChromeSqliteLinux(options, origins, allowlistNames) {
const args = {
configDirName: 'google-chrome',
};
if (options.profile !== undefined)
args.profile = options.profile;
const dbPath = resolveChromiumCookiesDbLinux(args);
if (!dbPath) {
return { cookies: [], warnings: ['Chrome cookies database not found.'] };
}
const { password, warnings: keyringWarnings } = await getLinuxChromeSafeStoragePassword();
// Linux uses multiple schemes depending on distro/keyring availability.
// - v10 often uses the hard-coded "peanuts" password
// - v11 uses "Chrome Safe Storage" from the keyring (may be empty/unavailable)
const v10Key = deriveAes128CbcKeyFromPassword('peanuts', { iterations: 1 });
const emptyKey = deriveAes128CbcKeyFromPassword('', { iterations: 1 });
const v11Key = deriveAes128CbcKeyFromPassword(password, { iterations: 1 });
const decrypt = (encryptedValue, opts) => {
const prefix = Buffer.from(encryptedValue).subarray(0, 3).toString('utf8');
if (prefix === 'v10') {
return decryptChromiumAes128CbcCookieValue(encryptedValue, [v10Key, emptyKey], {
stripHashPrefix: opts.stripHashPrefix,
treatUnknownPrefixAsPlaintext: false,
});
}
if (prefix === 'v11') {
return decryptChromiumAes128CbcCookieValue(encryptedValue, [v11Key, emptyKey], {
stripHashPrefix: opts.stripHashPrefix,
treatUnknownPrefixAsPlaintext: false,
});
}
return null;
};
const dbOptions = {
dbPath,
};
if (options.profile)
dbOptions.profile = options.profile;
if (options.includeExpired !== undefined)
dbOptions.includeExpired = options.includeExpired;
if (options.debug !== undefined)
dbOptions.debug = options.debug;
const result = await getCookiesFromChromeSqliteDb(dbOptions, origins, allowlistNames, decrypt);
result.warnings.unshift(...keyringWarnings);
return result;
}
//# sourceMappingURL=chromeSqliteLinux.js.map
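Per the comments above, both Linux schemes derive an AES-128 key via PBKDF2-SHA1 over the fixed salt `saltysalt` with a single iteration; only the password differs (`peanuts` for v10, the keyring secret for v11), and the first three bytes of `encrypted_value` select the scheme. A sketch under those assumptions:

```javascript
import { pbkdf2Sync } from 'node:crypto';

// PBKDF2-SHA1("saltysalt", 1 iteration) -> 16-byte AES-128 key.
function deriveAes128Key(password) {
    return pbkdf2Sync(password, 'saltysalt', 1, 16, 'sha1');
}

// The 3-byte prefix picks between the v10 and v11 keys.
function schemeFor(encryptedValue) {
    const prefix = Buffer.from(encryptedValue).subarray(0, 3).toString('utf8');
    if (prefix === 'v10' || prefix === 'v11') return prefix;
    return null;
}

const v10Key = deriveAes128Key('peanuts'); // hard-coded v10 password
console.log(v10Key.length, schemeFor(Buffer.from('v11\x01\x02'))); // 16 v11
```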

View File

@@ -0,0 +1 @@
{"version":3,"file":"chromeSqliteLinux.js","sourceRoot":"","sources":["../../src/providers/chromeSqliteLinux.ts"],"names":[],"mappings":"AACA,OAAO,EACN,mCAAmC,EACnC,8BAA8B,GAC9B,MAAM,0BAA0B,CAAC;AAClC,OAAO,EAAE,iCAAiC,EAAE,MAAM,gCAAgC,CAAC;AACnF,OAAO,EAAE,4BAA4B,EAAE,MAAM,0BAA0B,CAAC;AACxE,OAAO,EAAE,6BAA6B,EAAE,MAAM,0BAA0B,CAAC;AAEzE,MAAM,CAAC,KAAK,UAAU,+BAA+B,CACpD,OAAwE,EACxE,OAAiB,EACjB,cAAkC;IAElC,MAAM,IAAI,GAAwD;QACjE,aAAa,EAAE,eAAe;KAC9B,CAAC;IACF,IAAI,OAAO,CAAC,OAAO,KAAK,SAAS;QAAE,IAAI,CAAC,OAAO,GAAG,OAAO,CAAC,OAAO,CAAC;IAClE,MAAM,MAAM,GAAG,6BAA6B,CAAC,IAAI,CAAC,CAAC;IACnD,IAAI,CAAC,MAAM,EAAE,CAAC;QACb,OAAO,EAAE,OAAO,EAAE,EAAE,EAAE,QAAQ,EAAE,CAAC,oCAAoC,CAAC,EAAE,CAAC;IAC1E,CAAC;IAED,MAAM,EAAE,QAAQ,EAAE,QAAQ,EAAE,eAAe,EAAE,GAAG,MAAM,iCAAiC,EAAE,CAAC;IAE1F,wEAAwE;IACxE,qDAAqD;IACrD,+EAA+E;IAC/E,MAAM,MAAM,GAAG,8BAA8B,CAAC,SAAS,EAAE,EAAE,UAAU,EAAE,CAAC,EAAE,CAAC,CAAC;IAC5E,MAAM,QAAQ,GAAG,8BAA8B,CAAC,EAAE,EAAE,EAAE,UAAU,EAAE,CAAC,EAAE,CAAC,CAAC;IACvE,MAAM,MAAM,GAAG,8BAA8B,CAAC,QAAQ,EAAE,EAAE,UAAU,EAAE,CAAC,EAAE,CAAC,CAAC;IAE3E,MAAM,OAAO,GAAG,CACf,cAA0B,EAC1B,IAAkC,EAClB,EAAE;QAClB,MAAM,MAAM,GAAG,MAAM,CAAC,IAAI,CAAC,cAAc,CAAC,CAAC,QAAQ,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC,QAAQ,CAAC,MAAM,CAAC,CAAC;QAC3E,IAAI,MAAM,KAAK,KAAK,EAAE,CAAC;YACtB,OAAO,mCAAmC,CAAC,cAAc,EAAE,CAAC,MAAM,EAAE,QAAQ,CAAC,EAAE;gBAC9E,eAAe,EAAE,IAAI,CAAC,eAAe;gBACrC,6BAA6B,EAAE,KAAK;aACpC,CAAC,CAAC;QACJ,CAAC;QACD,IAAI,MAAM,KAAK,KAAK,EAAE,CAAC;YACtB,OAAO,mCAAmC,CAAC,cAAc,EAAE,CAAC,MAAM,EAAE,QAAQ,CAAC,EAAE;gBAC9E,eAAe,EAAE,IAAI,CAAC,eAAe;gBACrC,6BAA6B,EAAE,KAAK;aACpC,CAAC,CAAC;QACJ,CAAC;QACD,OAAO,IAAI,CAAC;IACb,CAAC,CAAC;IAEF,MAAM,SAAS,GACd;QACC,MAAM;KACN,CAAC;IACH,IAAI,OAAO,CAAC,OAAO;QAAE,SAAS,CAAC,OAAO,GAAG,OAAO,CAAC,OAAO,CAAC;IACzD,IAAI,OAAO,CAAC,cAAc,KAAK,SAAS;QAAE,SAAS,CAAC,cAAc,GAAG,OAAO,CAAC,cAAc,CAAC;IAC5F,IAAI,OAAO,CAAC,KAAK,KAAK,SAAS;QAAE,SAAS,CAAC,KAAK,GAAG,OAAO,CAAC,KAAK,CAAC;IAEjE,MAAM,MAAM,GAAG,MAAM,4BAA4B,CAAC,SAAS,EAAE,OAAO,EAAE,cAAc,EAAE,OAAO,CAAC,CAAC;IAC/F,MAAM,CAAC,QAAQ,CAAC,OAAO,CAAC,GAAG,eAAe,CAAC,CAAC;IAC5C,OAAO,MAAM,CAAC;AACf,CAAC"}

View File

@@ -0,0 +1,7 @@
import type { GetCookiesResult } from '../types.js';
export declare function getCookiesFromChromeSqliteMac(options: {
profile?: string;
includeExpired?: boolean;
debug?: boolean;
}, origins: string[], allowlistNames: Set<string> | null): Promise<GetCookiesResult>;
//# sourceMappingURL=chromeSqliteMac.d.ts.map

View File

@@ -0,0 +1 @@
{"version":3,"file":"chromeSqliteMac.d.ts","sourceRoot":"","sources":["../../src/providers/chromeSqliteMac.ts"],"names":[],"mappings":"AAGA,OAAO,KAAK,EAAE,gBAAgB,EAAE,MAAM,aAAa,CAAC;AASpD,wBAAsB,6BAA6B,CAClD,OAAO,EAAE;IAAE,OAAO,CAAC,EAAE,MAAM,CAAC;IAAC,cAAc,CAAC,EAAE,OAAO,CAAC;IAAC,KAAK,CAAC,EAAE,OAAO,CAAA;CAAE,EACxE,OAAO,EAAE,MAAM,EAAE,EACjB,cAAc,EAAE,GAAG,CAAC,MAAM,CAAC,GAAG,IAAI,GAChC,OAAO,CAAC,gBAAgB,CAAC,CA6C3B"}

View File

@@ -0,0 +1,60 @@
import { homedir } from 'node:os';
import path from 'node:path';
import { decryptChromiumAes128CbcCookieValue, deriveAes128CbcKeyFromPassword, } from './chromeSqlite/crypto.js';
import { getCookiesFromChromeSqliteDb } from './chromeSqlite/shared.js';
import { readKeychainGenericPasswordFirst } from './chromium/macosKeychain.js';
import { resolveCookiesDbFromProfileOrRoots } from './chromium/paths.js';
export async function getCookiesFromChromeSqliteMac(options, origins, allowlistNames) {
const dbPath = resolveChromeCookiesDb(options.profile);
if (!dbPath) {
return { cookies: [], warnings: ['Chrome cookies database not found.'] };
}
const warnings = [];
// On macOS, Chrome stores its "Safe Storage" secret in Keychain.
// `security find-generic-password` is stable and avoids any native Node keychain modules.
const passwordResult = await readKeychainGenericPasswordFirst({
account: 'Chrome',
services: ['Chrome Safe Storage'],
timeoutMs: 3_000,
label: 'Chrome Safe Storage',
});
if (!passwordResult.ok) {
warnings.push(passwordResult.error);
return { cookies: [], warnings };
}
const chromePassword = passwordResult.password.trim();
if (!chromePassword) {
warnings.push('macOS Keychain returned an empty Chrome Safe Storage password.');
return { cookies: [], warnings };
}
// Chromium uses PBKDF2(password, "saltysalt", 1003, 16, sha1) for AES-128-CBC cookie values on macOS.
const key = deriveAes128CbcKeyFromPassword(chromePassword, { iterations: 1003 });
const decrypt = (encryptedValue, opts) => decryptChromiumAes128CbcCookieValue(encryptedValue, [key], {
stripHashPrefix: opts.stripHashPrefix,
treatUnknownPrefixAsPlaintext: true,
});
const dbOptions = {
dbPath,
};
if (options.profile)
dbOptions.profile = options.profile;
if (options.includeExpired !== undefined)
dbOptions.includeExpired = options.includeExpired;
if (options.debug !== undefined)
dbOptions.debug = options.debug;
const result = await getCookiesFromChromeSqliteDb(dbOptions, origins, allowlistNames, decrypt);
result.warnings.unshift(...warnings);
return result;
}
function resolveChromeCookiesDb(profile) {
const home = homedir();
/* c8 ignore next */
const roots = process.platform === 'darwin'
? [path.join(home, 'Library', 'Application Support', 'Google', 'Chrome')]
: [];
const args = { roots };
if (profile !== undefined)
args.profile = profile;
return resolveCookiesDbFromProfileOrRoots(args);
}
//# sourceMappingURL=chromeSqliteMac.js.map
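A round-trip sketch of the v10 scheme the macOS path decodes, under the PBKDF2 parameters stated in the comment above (password, `saltysalt`, 1003 iterations, 16 bytes, SHA-1). The fixed IV of sixteen `0x20` bytes is an assumption about Chromium's format that does not appear in this diff, and the password here is a stand-in for the real Keychain secret.

```javascript
import { createCipheriv, createDecipheriv, pbkdf2Sync } from 'node:crypto';

const key = pbkdf2Sync('example-safe-storage-password', 'saltysalt', 1003, 16, 'sha1');
const iv = Buffer.alloc(16, 0x20); // assumed: sixteen space bytes

function encryptV10(plaintext) {
    const cipher = createCipheriv('aes-128-cbc', key, iv);
    return Buffer.concat([Buffer.from('v10'), cipher.update(plaintext, 'utf8'), cipher.final()]);
}

function decryptV10(encrypted) {
    const decipher = createDecipheriv('aes-128-cbc', key, iv);
    const body = encrypted.subarray(3); // strip the 'v10' scheme prefix
    return Buffer.concat([decipher.update(body), decipher.final()]).toString('utf8');
}

console.log(decryptV10(encryptV10('session=abc123'))); // session=abc123
```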

View File

@@ -0,0 +1 @@
{"version":3,"file":"chromeSqliteMac.js","sourceRoot":"","sources":["../../src/providers/chromeSqliteMac.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,OAAO,EAAE,MAAM,SAAS,CAAC;AAClC,OAAO,IAAI,MAAM,WAAW,CAAC;AAG7B,OAAO,EACN,mCAAmC,EACnC,8BAA8B,GAC9B,MAAM,0BAA0B,CAAC;AAClC,OAAO,EAAE,4BAA4B,EAAE,MAAM,0BAA0B,CAAC;AACxE,OAAO,EAAE,gCAAgC,EAAE,MAAM,6BAA6B,CAAC;AAC/E,OAAO,EAAE,kCAAkC,EAAE,MAAM,qBAAqB,CAAC;AAEzE,MAAM,CAAC,KAAK,UAAU,6BAA6B,CAClD,OAAwE,EACxE,OAAiB,EACjB,cAAkC;IAElC,MAAM,MAAM,GAAG,sBAAsB,CAAC,OAAO,CAAC,OAAO,CAAC,CAAC;IACvD,IAAI,CAAC,MAAM,EAAE,CAAC;QACb,OAAO,EAAE,OAAO,EAAE,EAAE,EAAE,QAAQ,EAAE,CAAC,oCAAoC,CAAC,EAAE,CAAC;IAC1E,CAAC;IAED,MAAM,QAAQ,GAAa,EAAE,CAAC;IAC9B,iEAAiE;IACjE,0FAA0F;IAC1F,MAAM,cAAc,GAAG,MAAM,gCAAgC,CAAC;QAC7D,OAAO,EAAE,QAAQ;QACjB,QAAQ,EAAE,CAAC,qBAAqB,CAAC;QACjC,SAAS,EAAE,KAAK;QAChB,KAAK,EAAE,qBAAqB;KAC5B,CAAC,CAAC;IACH,IAAI,CAAC,cAAc,CAAC,EAAE,EAAE,CAAC;QACxB,QAAQ,CAAC,IAAI,CAAC,cAAc,CAAC,KAAK,CAAC,CAAC;QACpC,OAAO,EAAE,OAAO,EAAE,EAAE,EAAE,QAAQ,EAAE,CAAC;IAClC,CAAC;IAED,MAAM,cAAc,GAAG,cAAc,CAAC,QAAQ,CAAC,IAAI,EAAE,CAAC;IACtD,IAAI,CAAC,cAAc,EAAE,CAAC;QACrB,QAAQ,CAAC,IAAI,CAAC,gEAAgE,CAAC,CAAC;QAChF,OAAO,EAAE,OAAO,EAAE,EAAE,EAAE,QAAQ,EAAE,CAAC;IAClC,CAAC;IAED,sGAAsG;IACtG,MAAM,GAAG,GAAG,8BAA8B,CAAC,cAAc,EAAE,EAAE,UAAU,EAAE,IAAI,EAAE,CAAC,CAAC;IACjF,MAAM,OAAO,GAAG,CAAC,cAA0B,EAAE,IAAkC,EAAiB,EAAE,CACjG,mCAAmC,CAAC,cAAc,EAAE,CAAC,GAAG,CAAC,EAAE;QAC1D,eAAe,EAAE,IAAI,CAAC,eAAe;QACrC,6BAA6B,EAAE,IAAI;KACnC,CAAC,CAAC;IAEJ,MAAM,SAAS,GACd;QACC,MAAM;KACN,CAAC;IACH,IAAI,OAAO,CAAC,OAAO;QAAE,SAAS,CAAC,OAAO,GAAG,OAAO,CAAC,OAAO,CAAC;IACzD,IAAI,OAAO,CAAC,cAAc,KAAK,SAAS;QAAE,SAAS,CAAC,cAAc,GAAG,OAAO,CAAC,cAAc,CAAC;IAC5F,IAAI,OAAO,CAAC,KAAK,KAAK,SAAS;QAAE,SAAS,CAAC,KAAK,GAAG,OAAO,CAAC,KAAK,CAAC;IAEjE,MAAM,MAAM,GAAG,MAAM,4BAA4B,CAAC,SAAS,EAAE,OAAO,EAAE,cAAc,EAAE,OAAO,CAAC,CAAC;IAC/F,MAAM,CAAC,QAAQ,CAAC,OAAO,CAAC,GAAG,QAAQ,CAAC,CAAC;IACrC,OAAO,MAAM,CAAC;AACf,CAAC;AAED,SAAS,sBAAsB,CAAC,OAAgB;IAC/C,MAAM,IAAI,GAAG,OAAO,EAAE,CAAC;IACvB,oBAAoB;IACpB,MAAM,KAAK,GACV,OAAO,CAAC,QAAQ,KAAK,QAAQ;QAC5B,CAAC,CAAC,CAAC,IAAI,CAAC,IAAI,CAAC,IAAI,EAAE,SAAS,EAAE,qBAAqB,EAAE,QAAQ,EAAE,QAAQ,CAAC,CAAC;QACzE,CAAC,CAAC,EAAE,CAAC;IACP,MAAM,IAAI,GAA6D,EAAE,KAAK,EAAE,CAAC;IACjF,IAAI,OAAO,KAAK,SAAS;QAAE,IAAI,CAAC,OAAO,GAAG,OAAO,CAAC;IAClD,OAAO,kCAAkC,CAAC,IAAI,CAAC,CAAC;AACjD,CAAC"}

View File

@@ -0,0 +1,7 @@
import type { GetCookiesResult } from '../types.js';
export declare function getCookiesFromChromeSqliteWindows(options: {
profile?: string;
includeExpired?: boolean;
debug?: boolean;
}, origins: string[], allowlistNames: Set<string> | null): Promise<GetCookiesResult>;
//# sourceMappingURL=chromeSqliteWindows.d.ts.map

View File

@@ -0,0 +1 @@
{"version":3,"file":"chromeSqliteWindows.d.ts","sourceRoot":"","sources":["../../src/providers/chromeSqliteWindows.ts"],"names":[],"mappings":"AAEA,OAAO,KAAK,EAAE,gBAAgB,EAAE,MAAM,aAAa,CAAC;AAMpD,wBAAsB,iCAAiC,CACtD,OAAO,EAAE;IAAE,OAAO,CAAC,EAAE,MAAM,CAAC;IAAC,cAAc,CAAC,EAAE,OAAO,CAAC;IAAC,KAAK,CAAC,EAAE,OAAO,CAAA;CAAE,EACxE,OAAO,EAAE,MAAM,EAAE,EACjB,cAAc,EAAE,GAAG,CAAC,MAAM,CAAC,GAAG,IAAI,GAChC,OAAO,CAAC,gBAAgB,CAAC,CAmC3B"}

View File

@@ -0,0 +1,38 @@
import path from 'node:path';
import { decryptChromiumAes256GcmCookieValue } from './chromeSqlite/crypto.js';
import { getCookiesFromChromeSqliteDb } from './chromeSqlite/shared.js';
import { getWindowsChromiumMasterKey } from './chromium/windowsMasterKey.js';
import { resolveChromiumPathsWindows } from './chromium/windowsPaths.js';
export async function getCookiesFromChromeSqliteWindows(options, origins, allowlistNames) {
const resolveArgs = {
localAppDataVendorPath: path.join('Google', 'Chrome', 'User Data'),
};
if (options.profile !== undefined)
resolveArgs.profile = options.profile;
const { dbPath, userDataDir } = resolveChromiumPathsWindows(resolveArgs);
if (!dbPath || !userDataDir) {
return { cookies: [], warnings: ['Chrome cookies database not found.'] };
}
// On Windows, Chrome stores an AES key in `Local State` encrypted with DPAPI (CurrentUser).
// That master key is then used for AES-256-GCM cookie values (`v10`/`v11`/`v20` prefixes).
const masterKey = await getWindowsChromiumMasterKey(userDataDir, 'Chrome');
if (!masterKey.ok) {
return { cookies: [], warnings: [masterKey.error] };
}
const decrypt = (encryptedValue, opts) => {
return decryptChromiumAes256GcmCookieValue(encryptedValue, masterKey.value, {
stripHashPrefix: opts.stripHashPrefix,
});
};
const dbOptions = {
dbPath,
};
if (options.profile)
dbOptions.profile = options.profile;
if (options.includeExpired !== undefined)
dbOptions.includeExpired = options.includeExpired;
if (options.debug !== undefined)
dbOptions.debug = options.debug;
return await getCookiesFromChromeSqliteDb(dbOptions, origins, allowlistNames, decrypt);
}
//# sourceMappingURL=chromeSqliteWindows.js.map
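A round-trip sketch of the AES-256-GCM decoding the Windows path performs. The byte layout (3-byte `v10` prefix, 12-byte nonce, ciphertext, 16-byte GCM tag) is an assumption about Chromium's format not shown in this diff, and a random key stands in for the DPAPI-unwrapped master key.

```javascript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

const masterKey = randomBytes(32); // stand-in for the DPAPI-protected key

function encrypt(plaintext) {
    const nonce = randomBytes(12);
    const cipher = createCipheriv('aes-256-gcm', masterKey, nonce);
    const body = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
    return Buffer.concat([Buffer.from('v10'), nonce, body, cipher.getAuthTag()]);
}

function decrypt(encrypted) {
    const nonce = encrypted.subarray(3, 15);
    const tag = encrypted.subarray(encrypted.length - 16);
    const body = encrypted.subarray(15, encrypted.length - 16);
    const decipher = createDecipheriv('aes-256-gcm', masterKey, nonce);
    decipher.setAuthTag(tag); // GCM authenticates as well as decrypts
    return Buffer.concat([decipher.update(body), decipher.final()]).toString('utf8');
}

console.log(decrypt(encrypt('cookie-value'))); // cookie-value
```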

View File

@@ -0,0 +1 @@
{"version":3,"file":"chromeSqliteWindows.js","sourceRoot":"","sources":["../../src/providers/chromeSqliteWindows.ts"],"names":[],"mappings":"AAAA,OAAO,IAAI,MAAM,WAAW,CAAC;AAG7B,OAAO,EAAE,mCAAmC,EAAE,MAAM,0BAA0B,CAAC;AAC/E,OAAO,EAAE,4BAA4B,EAAE,MAAM,0BAA0B,CAAC;AACxE,OAAO,EAAE,2BAA2B,EAAE,MAAM,gCAAgC,CAAC;AAC7E,OAAO,EAAE,2BAA2B,EAAE,MAAM,4BAA4B,CAAC;AAEzE,MAAM,CAAC,KAAK,UAAU,iCAAiC,CACtD,OAAwE,EACxE,OAAiB,EACjB,cAAkC;IAElC,MAAM,WAAW,GAAsD;QACtE,sBAAsB,EAAE,IAAI,CAAC,IAAI,CAAC,QAAQ,EAAE,QAAQ,EAAE,WAAW,CAAC;KAClE,CAAC;IACF,IAAI,OAAO,CAAC,OAAO,KAAK,SAAS;QAAE,WAAW,CAAC,OAAO,GAAG,OAAO,CAAC,OAAO,CAAC;IACzE,MAAM,EAAE,MAAM,EAAE,WAAW,EAAE,GAAG,2BAA2B,CAAC,WAAW,CAAC,CAAC;IACzE,IAAI,CAAC,MAAM,IAAI,CAAC,WAAW,EAAE,CAAC;QAC7B,OAAO,EAAE,OAAO,EAAE,EAAE,EAAE,QAAQ,EAAE,CAAC,oCAAoC,CAAC,EAAE,CAAC;IAC1E,CAAC;IAED,4FAA4F;IAC5F,2FAA2F;IAC3F,MAAM,SAAS,GAAG,MAAM,2BAA2B,CAAC,WAAW,EAAE,QAAQ,CAAC,CAAC;IAC3E,IAAI,CAAC,SAAS,CAAC,EAAE,EAAE,CAAC;QACnB,OAAO,EAAE,OAAO,EAAE,EAAE,EAAE,QAAQ,EAAE,CAAC,SAAS,CAAC,KAAK,CAAC,EAAE,CAAC;IACrD,CAAC;IAED,MAAM,OAAO,GAAG,CACf,cAA0B,EAC1B,IAAkC,EAClB,EAAE;QAClB,OAAO,mCAAmC,CAAC,cAAc,EAAE,SAAS,CAAC,KAAK,EAAE;YAC3E,eAAe,EAAE,IAAI,CAAC,eAAe;SACrC,CAAC,CAAC;IACJ,CAAC,CAAC;IAEF,MAAM,SAAS,GACd;QACC,MAAM;KACN,CAAC;IACH,IAAI,OAAO,CAAC,OAAO;QAAE,SAAS,CAAC,OAAO,GAAG,OAAO,CAAC,OAAO,CAAC;IACzD,IAAI,OAAO,CAAC,cAAc,KAAK,SAAS;QAAE,SAAS,CAAC,cAAc,GAAG,OAAO,CAAC,cAAc,CAAC;IAC5F,IAAI,OAAO,CAAC,KAAK,KAAK,SAAS;QAAE,SAAS,CAAC,KAAK,GAAG,OAAO,CAAC,KAAK,CAAC;IAEjE,OAAO,MAAM,4BAA4B,CAAC,SAAS,EAAE,OAAO,EAAE,cAAc,EAAE,OAAO,CAAC,CAAC;AACxF,CAAC"}

View File

@@ -0,0 +1,5 @@
export declare function resolveChromiumCookiesDbLinux(options: {
configDirName: string;
profile?: string;
}): string | null;
//# sourceMappingURL=linuxPaths.d.ts.map

View File

@@ -0,0 +1 @@
{"version":3,"file":"linuxPaths.d.ts","sourceRoot":"","sources":["../../../src/providers/chromium/linuxPaths.ts"],"names":[],"mappings":"AAMA,wBAAgB,6BAA6B,CAAC,OAAO,EAAE;IACtD,aAAa,EAAE,MAAM,CAAC;IACtB,OAAO,CAAC,EAAE,MAAM,CAAC;CACjB,GAAG,MAAM,GAAG,IAAI,CA0BhB"}

View File

@@ -0,0 +1,33 @@
import { existsSync } from 'node:fs';
import { homedir } from 'node:os';
import path from 'node:path';
import { expandPath, looksLikePath } from './paths.js';
export function resolveChromiumCookiesDbLinux(options) {
const home = homedir();
// biome-ignore lint/complexity/useLiteralKeys: process.env is an index signature under strict TS.
const configHome = process.env['XDG_CONFIG_HOME']?.trim() || path.join(home, '.config');
const root = path.join(configHome, options.configDirName);
if (options.profile && looksLikePath(options.profile)) {
const candidate = expandPath(options.profile);
if (candidate.endsWith('Cookies') && existsSync(candidate))
return candidate;
const direct = path.join(candidate, 'Cookies');
if (existsSync(direct))
return direct;
const network = path.join(candidate, 'Network', 'Cookies');
if (existsSync(network))
return network;
return null;
}
const profileDir = options.profile && options.profile.trim().length > 0 ? options.profile.trim() : 'Default';
const candidates = [
path.join(root, profileDir, 'Cookies'),
path.join(root, profileDir, 'Network', 'Cookies'),
];
for (const candidate of candidates) {
if (existsSync(candidate))
return candidate;
}
return null;
}
//# sourceMappingURL=linuxPaths.js.map

View File

@@ -0,0 +1 @@
{"version":3,"file":"linuxPaths.js","sourceRoot":"","sources":["../../../src/providers/chromium/linuxPaths.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,UAAU,EAAE,MAAM,SAAS,CAAC;AACrC,OAAO,EAAE,OAAO,EAAE,MAAM,SAAS,CAAC;AAClC,OAAO,IAAI,MAAM,WAAW,CAAC;AAE7B,OAAO,EAAE,UAAU,EAAE,aAAa,EAAE,MAAM,YAAY,CAAC;AAEvD,MAAM,UAAU,6BAA6B,CAAC,OAG7C;IACA,MAAM,IAAI,GAAG,OAAO,EAAE,CAAC;IACvB,kGAAkG;IAClG,MAAM,UAAU,GAAG,OAAO,CAAC,GAAG,CAAC,iBAAiB,CAAC,EAAE,IAAI,EAAE,IAAI,IAAI,CAAC,IAAI,CAAC,IAAI,EAAE,SAAS,CAAC,CAAC;IACxF,MAAM,IAAI,GAAG,IAAI,CAAC,IAAI,CAAC,UAAU,EAAE,OAAO,CAAC,aAAa,CAAC,CAAC;IAE1D,IAAI,OAAO,CAAC,OAAO,IAAI,aAAa,CAAC,OAAO,CAAC,OAAO,CAAC,EAAE,CAAC;QACvD,MAAM,SAAS,GAAG,UAAU,CAAC,OAAO,CAAC,OAAO,CAAC,CAAC;QAC9C,IAAI,SAAS,CAAC,QAAQ,CAAC,SAAS,CAAC,IAAI,UAAU,CAAC,SAAS,CAAC;YAAE,OAAO,SAAS,CAAC;QAC7E,MAAM,MAAM,GAAG,IAAI,CAAC,IAAI,CAAC,SAAS,EAAE,SAAS,CAAC,CAAC;QAC/C,IAAI,UAAU,CAAC,MAAM,CAAC;YAAE,OAAO,MAAM,CAAC;QACtC,MAAM,OAAO,GAAG,IAAI,CAAC,IAAI,CAAC,SAAS,EAAE,SAAS,EAAE,SAAS,CAAC,CAAC;QAC3D,IAAI,UAAU,CAAC,OAAO,CAAC;YAAE,OAAO,OAAO,CAAC;QACxC,OAAO,IAAI,CAAC;IACb,CAAC;IAED,MAAM,UAAU,GACf,OAAO,CAAC,OAAO,IAAI,OAAO,CAAC,OAAO,CAAC,IAAI,EAAE,CAAC,MAAM,GAAG,CAAC,CAAC,CAAC,CAAC,OAAO,CAAC,OAAO,CAAC,IAAI,EAAE,CAAC,CAAC,CAAC,SAAS,CAAC;IAC3F,MAAM,UAAU,GAAG;QAClB,IAAI,CAAC,IAAI,CAAC,IAAI,EAAE,UAAU,EAAE,SAAS,CAAC;QACtC,IAAI,CAAC,IAAI,CAAC,IAAI,EAAE,UAAU,EAAE,SAAS,EAAE,SAAS,CAAC;KACjD,CAAC;IACF,KAAK,MAAM,SAAS,IAAI,UAAU,EAAE,CAAC;QACpC,IAAI,UAAU,CAAC,SAAS,CAAC;YAAE,OAAO,SAAS,CAAC;IAC7C,CAAC;IACD,OAAO,IAAI,CAAC;AACb,CAAC"}

View File

@@ -0,0 +1,24 @@
export declare function readKeychainGenericPassword(options: {
account: string;
service: string;
timeoutMs: number;
}): Promise<{
ok: true;
password: string;
} | {
ok: false;
error: string;
}>;
export declare function readKeychainGenericPasswordFirst(options: {
account: string;
services: string[];
timeoutMs: number;
label: string;
}): Promise<{
ok: true;
password: string;
} | {
ok: false;
error: string;
}>;
//# sourceMappingURL=macosKeychain.d.ts.map

View File

@@ -0,0 +1 @@
{"version":3,"file":"macosKeychain.d.ts","sourceRoot":"","sources":["../../../src/providers/chromium/macosKeychain.ts"],"names":[],"mappings":"AAEA,wBAAsB,2BAA2B,CAAC,OAAO,EAAE;IAC1D,OAAO,EAAE,MAAM,CAAC;IAChB,OAAO,EAAE,MAAM,CAAC;IAChB,SAAS,EAAE,MAAM,CAAC;CAClB,GAAG,OAAO,CAAC;IAAE,EAAE,EAAE,IAAI,CAAC;IAAC,QAAQ,EAAE,MAAM,CAAA;CAAE,GAAG;IAAE,EAAE,EAAE,KAAK,CAAC;IAAC,KAAK,EAAE,MAAM,CAAA;CAAE,CAAC,CAczE;AAED,wBAAsB,gCAAgC,CAAC,OAAO,EAAE;IAC/D,OAAO,EAAE,MAAM,CAAC;IAChB,QAAQ,EAAE,MAAM,EAAE,CAAC;IACnB,SAAS,EAAE,MAAM,CAAC;IAClB,KAAK,EAAE,MAAM,CAAC;CACd,GAAG,OAAO,CAAC;IAAE,EAAE,EAAE,IAAI,CAAC;IAAC,QAAQ,EAAE,MAAM,CAAA;CAAE,GAAG;IAAE,EAAE,EAAE,KAAK,CAAC;IAAC,KAAK,EAAE,MAAM,CAAA;CAAE,CAAC,CAgBzE"}

View File

@@ -0,0 +1,30 @@
import { execCapture } from '../../util/exec.js';
export async function readKeychainGenericPassword(options) {
const res = await execCapture('security', ['find-generic-password', '-w', '-a', options.account, '-s', options.service], { timeoutMs: options.timeoutMs });
if (res.code === 0) {
const password = res.stdout.trim();
return { ok: true, password };
}
return {
ok: false,
error: `${res.stderr.trim() || `exit ${res.code}`}`,
};
}
export async function readKeychainGenericPasswordFirst(options) {
let lastError = null;
for (const service of options.services) {
const r = await readKeychainGenericPassword({
account: options.account,
service,
timeoutMs: options.timeoutMs,
});
if (r.ok)
return r;
lastError = r.error;
}
return {
ok: false,
error: `Failed to read macOS Keychain (${options.label}): ${lastError ?? 'permission denied / keychain locked / entry missing.'}`,
};
}
//# sourceMappingURL=macosKeychain.js.map


@@ -0,0 +1,11 @@
export declare function looksLikePath(value: string): boolean;
export declare function expandPath(input: string): string;
export declare function safeStat(candidate: string): {
isFile: () => boolean;
isDirectory: () => boolean;
} | null;
export declare function resolveCookiesDbFromProfileOrRoots(options: {
profile?: string;
roots: string[];
}): string | null;
//# sourceMappingURL=paths.d.ts.map


@@ -0,0 +1,43 @@
import { existsSync, statSync } from 'node:fs';
import { homedir } from 'node:os';
import path from 'node:path';
export function looksLikePath(value) {
return value.includes('/') || value.includes('\\');
}
export function expandPath(input) {
if (input.startsWith('~/'))
return path.join(homedir(), input.slice(2));
return path.isAbsolute(input) ? input : path.resolve(process.cwd(), input);
}
export function safeStat(candidate) {
try {
return statSync(candidate);
}
catch {
return null;
}
}
export function resolveCookiesDbFromProfileOrRoots(options) {
const candidates = [];
if (options.profile && looksLikePath(options.profile)) {
const expanded = expandPath(options.profile);
const stat = safeStat(expanded);
if (stat?.isFile())
return expanded;
candidates.push(path.join(expanded, 'Cookies'));
candidates.push(path.join(expanded, 'Network', 'Cookies'));
}
else {
const profileDir = options.profile && options.profile.trim().length > 0 ? options.profile.trim() : 'Default';
for (const root of options.roots) {
candidates.push(path.join(root, profileDir, 'Cookies'));
candidates.push(path.join(root, profileDir, 'Network', 'Cookies'));
}
}
for (const candidate of candidates) {
if (existsSync(candidate))
return candidate;
}
return null;
}
//# sourceMappingURL=paths.js.map
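A POSIX-style re-statement of the candidate ordering used by `resolveCookiesDbFromProfileOrRoots` above, for the branch where the profile is a name rather than a path (the real code uses `path.join`): for each root, the legacy `<root>/<profile>/Cookies` location is checked before the newer `<root>/<profile>/Network/Cookies` one, and an empty profile falls back to `Default`. The `/tmp/chrome` root is a fabricated example.

```javascript
// Build the ordered candidate list for a profile name and a set of roots.
function cookieCandidates(profile, roots) {
  const profileDir = profile && profile.trim().length > 0 ? profile.trim() : 'Default';
  const out = [];
  for (const root of roots) {
    out.push(`${root}/${profileDir}/Cookies`);           // legacy location first
    out.push(`${root}/${profileDir}/Network/Cookies`);   // newer location second
  }
  return out;
}

const c = cookieCandidates('', ['/tmp/chrome']);
// c[0] === '/tmp/chrome/Default/Cookies'
// c[1] === '/tmp/chrome/Default/Network/Cookies'
```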


@@ -0,0 +1,8 @@
export declare function getWindowsChromiumMasterKey(userDataDir: string, label: string): Promise<{
ok: true;
value: Buffer;
} | {
ok: false;
error: string;
}>;
//# sourceMappingURL=windowsMasterKey.d.ts.map


@@ -0,0 +1,41 @@
import { existsSync, readFileSync } from 'node:fs';
import path from 'node:path';
import { dpapiUnprotect } from '../chromeSqlite/windowsDpapi.js';
export async function getWindowsChromiumMasterKey(userDataDir, label) {
const localStatePath = path.join(userDataDir, 'Local State');
if (!existsSync(localStatePath)) {
return { ok: false, error: `${label} Local State file not found.` };
}
let encryptedKeyB64 = null;
try {
const raw = readFileSync(localStatePath, 'utf8');
const parsed = JSON.parse(raw);
encryptedKeyB64 =
typeof parsed.os_crypt?.encrypted_key === 'string' ? parsed.os_crypt.encrypted_key : null;
}
catch (error) {
return {
ok: false,
error: `Failed to parse ${label} Local State: ${error instanceof Error ? error.message : String(error)}`,
};
}
if (!encryptedKeyB64)
return { ok: false, error: `${label} Local State missing os_crypt.encrypted_key.` };
let encryptedKey;
try {
encryptedKey = Buffer.from(encryptedKeyB64, 'base64');
}
catch {
return { ok: false, error: `${label} Local State contains an invalid encrypted_key.` };
}
const prefix = Buffer.from('DPAPI', 'utf8');
if (!encryptedKey.subarray(0, prefix.length).equals(prefix)) {
return { ok: false, error: `${label} encrypted_key does not start with DPAPI prefix.` };
}
const unprotected = await dpapiUnprotect(encryptedKey.subarray(prefix.length));
if (!unprotected.ok) {
return { ok: false, error: `DPAPI decrypt failed: ${unprotected.error}` };
}
return { ok: true, value: unprotected.value };
}
//# sourceMappingURL=windowsMasterKey.js.map
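A sketch of the DPAPI prefix handling in `getWindowsChromiumMasterKey` above: the decoded `os_crypt.encrypted_key` bytes must begin with the literal ASCII `DPAPI`, and only the remainder is handed to DPAPI decryption. The key bytes here are fabricated for illustration.

```javascript
// Fabricated encrypted key: the 5-byte ASCII "DPAPI" prefix followed by
// 4 placeholder payload bytes standing in for the DPAPI-protected blob.
const prefix = Buffer.from('DPAPI', 'utf8');
const encryptedKey = Buffer.concat([prefix, Buffer.from([0xde, 0xad, 0xbe, 0xef])]);

// Same check as above: compare the leading bytes, then strip the prefix.
const hasPrefix = encryptedKey.subarray(0, prefix.length).equals(prefix);
const payload = hasPrefix ? encryptedKey.subarray(prefix.length) : null;
// hasPrefix === true; payload holds the 4 bytes that would go to DPAPI
```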


@@ -0,0 +1,8 @@
export declare function resolveChromiumPathsWindows(options: {
localAppDataVendorPath: string;
profile?: string;
}): {
dbPath: string | null;
userDataDir: string | null;
};
//# sourceMappingURL=windowsPaths.d.ts.map


@@ -0,0 +1,53 @@
import { existsSync } from 'node:fs';
import path from 'node:path';
import { expandPath, looksLikePath } from './paths.js';
export function resolveChromiumPathsWindows(options) {
// biome-ignore lint/complexity/useLiteralKeys: process.env is an index signature under strict TS.
const localAppData = process.env['LOCALAPPDATA'];
const root = localAppData ? path.join(localAppData, options.localAppDataVendorPath) : null;
if (options.profile && looksLikePath(options.profile)) {
const expanded = expandPath(options.profile);
const candidates = expanded.endsWith('Cookies')
? [expanded]
: [
path.join(expanded, 'Network', 'Cookies'),
path.join(expanded, 'Cookies'),
path.join(expanded, 'Default', 'Network', 'Cookies'),
];
for (const candidate of candidates) {
if (!existsSync(candidate))
continue;
const userDataDir = findUserDataDir(candidate);
return { dbPath: candidate, userDataDir };
}
if (existsSync(path.join(expanded, 'Local State'))) {
return { dbPath: null, userDataDir: expanded };
}
}
const profileDir = options.profile && options.profile.trim().length > 0 ? options.profile.trim() : 'Default';
if (!root)
return { dbPath: null, userDataDir: null };
const candidates = [
path.join(root, profileDir, 'Network', 'Cookies'),
path.join(root, profileDir, 'Cookies'),
];
for (const candidate of candidates) {
if (existsSync(candidate))
return { dbPath: candidate, userDataDir: root };
}
return { dbPath: null, userDataDir: root };
}
function findUserDataDir(cookiesDbPath) {
let current = path.dirname(cookiesDbPath);
for (let i = 0; i < 6; i += 1) {
const localState = path.join(current, 'Local State');
if (existsSync(localState))
return current;
const next = path.dirname(current);
if (next === current)
break;
current = next;
}
return null;
}
//# sourceMappingURL=windowsPaths.js.map
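A re-statement of the `findUserDataDir` walk-up in `windowsPaths.js` above, with the directory-name and existence checks injected so the sketch runs anywhere: starting from the Cookies DB's directory, walk at most six parent levels looking for a directory containing `Local State`. The `/tmp/User Data` layout is hypothetical.

```javascript
// Walk up from the Cookies DB path, probing each ancestor for "Local State".
function findUserDataDirSketch(cookiesDbPath, dirname, hasLocalState) {
  let current = dirname(cookiesDbPath);
  for (let i = 0; i < 6; i += 1) {
    if (hasLocalState(current)) return current;
    const next = dirname(current);
    if (next === current) break; // reached the filesystem root
    current = next;
  }
  return null;
}

// Minimal POSIX-style dirname, sufficient for this fabricated layout.
const dirname = (p) => p.split('/').slice(0, -1).join('/') || '/';

// Hypothetical layout: "Local State" sits two levels above the Cookies DB.
const found = findUserDataDirSketch(
  '/tmp/User Data/Default/Network/Cookies',
  dirname,
  (dir) => dir === '/tmp/User Data'
);
// found === '/tmp/User Data'
```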


@@ -0,0 +1,8 @@
import type { GetCookiesResult } from '../types.js';
export declare function getCookiesFromEdge(options: {
profile?: string;
timeoutMs?: number;
includeExpired?: boolean;
debug?: boolean;
}, origins: string[], allowlistNames: Set<string> | null): Promise<GetCookiesResult>;
//# sourceMappingURL=edge.d.ts.map

Some files were not shown because too many files have changed in this diff.