Add video ID based query and ReVanced stubs

This adds the endpoints ReVanced queries when opening the settings or watching
videos. The version of skipSegments it uses is actually implemented, and the
others are stubbed out.

I still haven't managed to get ReVanced to actually *work* with this, but it at
least isn't getting any server errors sent to it now.

I've also added my learnings to the README.
Adam Novak 2022-10-29 22:42:59 -04:00
parent 5e945d0914
commit a9683b08a0
4 changed files with 168 additions and 30 deletions

View file

@@ -13,3 +13,51 @@ It also uses [sb-mirror](https://github.com/mchangrh/sb-mirror) for mirroring th
Feel free to add your instance to this list by making a pull request.
You can also configure Piped-Backend to use your mirror by changing the `SPONSORBLOCK_SERVERS` configuration value.
## Compatibility
This implementation does not implement the full SponsorBlock server API. It supports hash-based queries to `/api/skipSegments/<hash>`, with an optional `categories` parameter, and queries to `/api/skipSegments` with a required `videoID` parameter and an optional `categories` parameter.
The browser extension works with only the hash-based query endpoint, but other clients, such as the one in ReVanced, require the video ID endpoint, and additionally query `/api/userInfo` and `/api/isUserVIP`. Right now there are only stub implementations of these two. ReVanced has not yet been verified as compatible.
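As a rough illustration of the two query styles described above, here is a minimal Rust sketch that only builds the request URLs. The helper names (`percent_encode`, `categories_json`, `hash_query_url`, `id_query_url`) are hypothetical and not part of this repository, and the base URL is whatever your instance listens on. It assumes `categories` is sent as a JSON array of strings, which is how the server parses it, and it percent-encodes the parameter values.

```rust
// Hypothetical client-side helpers; they only build URLs and do no networking.

/// Percent-encode every byte except the RFC 3986 "unreserved" characters.
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Render category names as a JSON array of strings.
/// Assumes the names contain no quotes or backslashes.
fn categories_json(categories: &[&str]) -> String {
    let quoted: Vec<String> = categories.iter().map(|c| format!("\"{}\"", c)).collect();
    format!("[{}]", quoted.join(","))
}

/// URL for a hash-based query: /api/skipSegments/<hash>?categories=...
fn hash_query_url(base: &str, hash_prefix: &str, categories: &[&str]) -> String {
    format!(
        "{}/api/skipSegments/{}?categories={}",
        base,
        hash_prefix,
        percent_encode(&categories_json(categories))
    )
}

/// URL for a video ID query: /api/skipSegments?videoID=...&categories=...
fn id_query_url(base: &str, video_id: &str, categories: &[&str]) -> String {
    format!(
        "{}/api/skipSegments?videoID={}&categories={}",
        base,
        percent_encode(video_id),
        percent_encode(&categories_json(categories))
    )
}
```

For instance, `hash_query_url("http://localhost:8000", "aabf", &["sponsor"])` yields `http://localhost:8000/api/skipSegments/aabf?categories=%5B%22sponsor%22%5D`. When `categories` is left out entirely, the lookup code in this commit falls back to `["sponsor"]`.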
## Building
To make a local release build, use `cargo build --release`. This will produce a binary in `target/release/sponsorblock-mirror`.
To make a Docker container, you need to do a BuildKit Docker build, not a normal Docker build. Make sure you have `buildx` available in your Docker, and run:
```bash
docker buildx build --load -t 1337kavin/sponsorblock-mirror .
```
## Using with Docker Compose
To run the server under Docker Compose, run:
```
docker compose up
```
This starts the API server, a database, and a mirroring service to download the SponsorBlock data from the `sponsorblock.kavin.rocks` mirror and keep it up to date.
The API will be available on `http://localhost:8000`. For example, you can try `http://localhost:8000/api/skipSegments/aabf` or `http://localhost:8000/api/skipSegments?videoID=eQ_8F4nzyiw`. **It will take a few minutes at least for the database to download and import,** so these will not return data on the first run.
## Troubleshooting
* If the linker complains about a missing `-lpq`, make sure you have the PostgreSQL development libraries, which may be in a `libpq-dev` package or your distribution's equivalent.
* If Docker complains that `the --mount option requires BuildKit`, make sure you are building with `docker buildx build` and not `docker build`.
* If `docker compose` complains like this:
```
ERROR: The Compose file './docker-compose.yml' is invalid because:
Unsupported config option for volumes: 'postgres_data'
Unsupported config option for services: 'sb-mirror'
```
then you are using an old version of `docker compose` which does not fully support the Compose Specification and [requires a 'version' key to differentiate the file from a V1 compose file](https://docs.docker.com/compose/compose-file/#version-top-level-element). Try appending `version: "3"` to the file.
* On the first run of `docker compose`, even after the database files are downloaded, you may see errors like `could not open file "/mirror/sponsorTimes.csv" for reading: Permission denied`. Assuming the permissions on the `.csv` files are actually set to be world-readable, you might be able to fix this by stopping and restarting `docker compose`.
* To access the PostgreSQL database directly, you can `docker exec -ti postgres-sb-mirror bash -c 'psql $POSTGRES_DB $POSTGRES_USER'`.
* Requests for videos not in the database are forwarded to `https://sponsor.ajay.app/`, which may be down or malfunctioning. A response of the string `Internal Server Error` is likely to be from there, rather than from this application.
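The last point can be checked mechanically on the client side. The following is a minimal sketch, assuming that a successful reply is always a JSON array (which is what the segment responses in this commit serialize to) and that the upstream failure arrives as the literal string quoted above. The `Reply` enum and `classify_reply` function are hypothetical names, not part of this project.

```rust
/// Coarse classification of a response body from /api/skipSegments.
#[derive(Debug, PartialEq)]
enum Reply {
    /// Body looks like a JSON array of segments (possibly empty).
    Segments,
    /// Body is the plain-text error most likely produced by the upstream server.
    UpstreamError,
    /// Anything else, e.g. a validation message from this application.
    Other,
}

/// Classify a body by its shape only; this does not parse the JSON.
fn classify_reply(body: &str) -> Reply {
    let trimmed = body.trim();
    if trimmed.starts_with('[') {
        Reply::Segments
    } else if trimmed == "Internal Server Error" {
        Reply::UpstreamError
    } else {
        Reply::Other
    }
}
```

This only inspects the shape of the body, so a client that needs the segments still has to parse the JSON afterwards.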

View file

@@ -14,7 +14,7 @@ use tokio::time::interval;
use structs::{Segment, Sponsor};
use crate::routes::skip_segments;
use crate::routes::{skip_segments, skip_segments_by_id, fake_is_user_vip, fake_user_info};
mod models;
mod routes;
@@ -120,5 +120,5 @@ fn rocket() -> Rocket<Build> {
})
})
).attach(CORS)
.mount("/", routes![skip_segments])
.mount("/", routes![skip_segments, skip_segments_by_id, fake_is_user_vip, fake_user_info])
}

View file

@@ -6,51 +6,135 @@ use rocket::response::content;
use crate::{Db, Segment, Sponsor};
use crate::models::SponsorTime;
use crate::schema::sponsorTimes::dsl::*;
// We *must* use "videoID" as an argument name to get Rocket to let us access
// the query parameter by that name, but if videoID is already used we
// can't do that.
use crate::schema::sponsorTimes::dsl::{
sponsorTimes,
shadowHidden,
hidden,
votes,
category,
hashedVideoID,
videoID as column_videoID
};
// init regex to match hash/hex
// init regexes to match hash/hex or video ID
lazy_static! {
static ref RE: regex::Regex = regex::Regex::new(r"^[0-9a-f]{4}$").unwrap();
static ref HASH_RE: regex::Regex = regex::Regex::new(r"^[0-9a-f]{4}$").unwrap();
static ref ID_RE: regex::Regex = regex::Regex::new(r"^[a-zA-Z0-9_-]{6,11}$").unwrap();
}
// Segments can be fetched either by full video ID, or by prefix of hashed
// video ID. Different clients make different queries. This represents either
// kind of constraint.
enum VideoName {
ByHashPrefix(String),
ByID(String),
}
#[get("/api/skipSegments/<hash>?<categories>")]
pub async fn skip_segments(
hash: String,
categories: Option<&str>,
db: Db,
) -> content::RawJson<String> {
let hash = hash.to_lowercase();
// Check if hash matches hex regex
if !RE.is_match(&hash) {
if !HASH_RE.is_match(&hash) {
return content::RawJson("Hash prefix does not match format requirements.".to_string());
}
let hc = hash.clone();
let search_result = find_skip_segments(VideoName::ByHashPrefix(hash.clone()), categories, db).await;
match search_result {
Some(segments) => return segments,
None => {
// Fall back to central SponsorBlock server
let resp = reqwest::get(format!(
"https://sponsor.ajay.app/api/skipSegments/{}?categories={}",
hash,
categories.unwrap_or("[]"),
))
.await
.unwrap()
.text()
.await
.unwrap();
return content::RawJson(resp);
}
}
}
#[get("/api/skipSegments?<videoID>&<categories>")]
pub async fn skip_segments_by_id(
#[allow(non_snake_case)]
videoID: String,
categories: Option<&str>,
db: Db,
) -> content::RawJson<String> {
if videoID.is_empty() {
return content::RawJson("videoID is missing".to_string());
}
let search_result = find_skip_segments(VideoName::ByID(videoID.clone()), categories, db).await;
match search_result {
Some(segments) => return segments,
None => {
// Fall back to central SponsorBlock server
let resp = reqwest::get(format!(
"https://sponsor.ajay.app/api/skipSegments?videoID={}&categories={}",
videoID,
categories.unwrap_or("[]"),
))
.await
.unwrap()
.text()
.await
.unwrap();
return content::RawJson(resp);
}
}
}
async fn find_skip_segments(
name: VideoName,
categories: Option<&str>,
db: Db,
) -> Option<content::RawJson<String>> {
let cat: Vec<String> = serde_json::from_str(categories.unwrap_or("[\"sponsor\"]")).unwrap();
if cat.is_empty() {
return content::RawJson(
return Some(content::RawJson(
"[]".to_string(),
);
));
}
let results: Vec<SponsorTime> = db.run(move |conn| {
let base_filter = sponsorTimes
.filter(shadowHidden.eq(0))
.filter(hidden.eq(0))
.filter(votes.ge(0))
.filter(hashedVideoID.like(format!("{}%", hc)));
let queried = {
if cat.is_empty() {
.filter(category.eq_any(cat)); // We know cat isn't empty at this point
let queried = match name {
VideoName::ByHashPrefix(hash_prefix) => {
base_filter
.filter(hashedVideoID.like(format!("{}%", hash_prefix)))
.get_results::<SponsorTime>(conn)
.expect("Failed to query sponsor times")
} else {
}
VideoName::ByID(video_id) => {
base_filter
.filter(category.eq_any(cat))
.filter(column_videoID.eq(video_id))
.get_results::<SponsorTime>(conn)
.expect("Failed to query sponsor times")
}
@@ -115,21 +199,10 @@ pub async fn skip_segments(
if !sponsors.is_empty() {
let sponsors: Vec<&Sponsor> = sponsors.values().collect();
return content::RawJson(serde_json::to_string(&sponsors).unwrap());
return Some(content::RawJson(serde_json::to_string(&sponsors).unwrap()));
}
let resp = reqwest::get(format!(
"https://sponsor.ajay.app/api/skipSegments/{}?categories={}",
hash,
categories.unwrap_or("[]"),
))
.await
.unwrap()
.text()
.await
.unwrap();
return content::RawJson(resp);
return None;
}
fn similar_segments(segment: &Segment, hash: &str, segments: &Vec<SponsorTime>) -> Vec<Segment> {
@@ -197,4 +270,20 @@ fn best_segment(segments: &Vec<Segment>) -> Segment {
}
best_segment
}
}
// We might need some more routes to support ReVanced. These are faked for now.
// Sadly not even these are sufficient to make it work for me, maybe it is just
// broken?
// This would take a userID
#[get("/api/isUserVIP")]
pub async fn fake_is_user_vip() -> content::RawJson<String> {
content::RawJson("{\"hashedUserID\": \"\", \"vip\": false}".to_string())
}
// This would take a userID and an optional list of values
#[get("/api/userInfo")]
pub async fn fake_user_info() -> content::RawJson<String> {
content::RawJson("{\"userID\": \"\", \"userName\": \"\", \"minutesSaved\": 0, \"segmentCount\": 0, \"viewCount\": 0}".to_string())
}
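For reference, the two patterns defined in the `lazy_static!` block of this file, `^[0-9a-f]{4}$` and `^[a-zA-Z0-9_-]{6,11}$`, can also be expressed without the `regex` crate. This is only a sketch of equivalent checks using the standard library; `is_hash_prefix` and `is_video_id` are hypothetical names that do not appear in the commit. In the hunks shown here, only the hash pattern is applied to a request, and the video ID route checks only for an empty string.

```rust
/// Equivalent of `^[0-9a-f]{4}$`: exactly four lowercase hex digits.
/// The route lowercases its input first, so uppercase is rejected here.
fn is_hash_prefix(s: &str) -> bool {
    s.len() == 4 && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
}

/// Equivalent of `^[a-zA-Z0-9_-]{6,11}$`: 6 to 11 characters drawn from
/// ASCII letters, digits, underscore and hyphen.
fn is_video_id(s: &str) -> bool {
    // Byte length equals character count whenever every byte passes the
    // ASCII-only check below, so comparing `len()` is sufficient.
    (6..=11).contains(&s.len())
        && s.bytes().all(|b| b.is_ascii_alphanumeric() || b == b'_' || b == b'-')
}
```

With these, `is_hash_prefix("aabf")` and `is_video_id("eQ_8F4nzyiw")` both hold, while `is_hash_prefix("AABF")` does not.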

View file

@@ -1,6 +1,7 @@
// @generated automatically by Diesel CLI.
diesel::table! {
#[allow(non_snake_case)]
sponsorTimes (UUID) {
videoID -> Text,
startTime -> Float4,