Driving an API-less blog from a Python CLI
This blog is a git repo of markdown files. I edit posts locally and push them live with a 440-line Python CLI (`bearblog.py`, with `requests` as its only dependency). Bear Blog has no public API, so the CLI logs in with account credentials, keeps the session cookie, and drives the same dashboard form endpoints the web UI uses.
python bearblog.py list # uid + title of every post
python bearblog.py new post.md # create & publish from a file
python bearblog.py edit <uid> post.md # overwrite an existing post
Each post is three form fields, `header_content`, `body_content`, and `publish`, POSTed to `/<blog>/dashboard/posts/new/` (and `/<uid>/` to edit). Scraping a form-driven Django app instead of an API turned up five quirks:
- No API, so isolate the surface. Every endpoint lives in one `BearBlog` class at the top of the file. When the dashboard HTML shifts, there's one place to patch.
- The login response lies. `django-allauth` keeps a logged-out session alive on `/accounts/login/`, so a 200 there means nothing. The real auth check is whether the session can reach the blog dashboard.
- CSRF wants three things. Django needs the token in the form body and a matching `Referer` and an `Origin` header. Two of three still gets you a 403.
- The header splits on CRLF only. Send the `key: value` header block with bare `\n` and Bear Blog parses it as one line: the whole block lands in `title` and the slug mangles. Normalize to `\r\n`.
- HTTP 200 can mean rejected. Overflow the ~200-char header limit and the save fails, but the response is still 200. The only signal is a lightsalmon `<p>` banner in the body. Grep for "has not been saved" and raise.
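The last three quirks are small enough to sketch as pure functions. This is a rough illustration, not the code in `bearblog.py`: the function names, the `SaveRejected` exception, and the `base_url` parameter are all assumptions made up for the example, and the helpers only build values that a `requests` session would then send.

```python
class SaveRejected(Exception):
    """Hypothetical error for a save that failed despite an HTTP 200."""


def normalize_header(header: str) -> str:
    """Rewrite every line ending in the key: value header block as CRLF."""
    # Collapse existing CRLF and lone CR to LF first, so already-correct
    # input is not turned into \r\r\n.
    unified = header.replace("\r\n", "\n").replace("\r", "\n")
    return unified.replace("\n", "\r\n")


def csrf_headers(base_url: str, form_path: str) -> dict[str, str]:
    """Build the Referer and Origin headers that accompany the form token.

    base_url is the scheme and host of the dashboard (an assumed parameter),
    form_path is the path of the form being submitted.
    """
    origin = base_url.rstrip("/")
    return {"Referer": origin + form_path, "Origin": origin}


def check_saved(status_code: int, body: str) -> None:
    """Raise if the response signals a rejected save, even on HTTP 200."""
    if status_code != 200:
        raise SaveRejected(f"unexpected HTTP status {status_code}")
    # The status code alone is not enough: look for the banner text.
    if "has not been saved" in body:
        raise SaveRejected("dashboard reported that the post was not saved")
```

In use, the normalized header would go into `header_content`, the two headers would be passed alongside the CSRF token in the form body, and `check_saved` would run on every response before the CLI reports success.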
This post is a markdown file in post/, published with python bearblog.py new.