- [Loading the home page](https://recloudstream.github.io/dokka/-cloudstream/com.lagradost.cloudstream3/-main-a-p-i/index.html#1356482668%2FFunctions%2F101969414)
- [Loading the show page](https://recloudstream.github.io/dokka/-cloudstream/com.lagradost.cloudstream3/-main-a-p-i/index.html#1671784382%2FFunctions%2F101969414)
- [Loading the video links](https://recloudstream.github.io/dokka/-cloudstream/com.lagradost.cloudstream3/-main-a-p-i/index.html#-930139416%2FFunctions%2F101969414)
If you are unfamiliar with the concept of scraping, start by reading [this guide](scraping/gettingstarted/), which should familiarize you with the technique.
## 1. Searching
Searching is probably the easiest step: based on a query, you return a list of [SearchResponse](https://recloudstream.github.io/dokka/app/com.lagradost.cloudstream3/-search-response/index.html).
Scraping the search results essentially means finding the search item elements on the site (red box), reading the name, URL and poster URL from each of them, and putting that data into a SearchResponse.
In this code snippet I have separated the Element-to-SearchResponse conversion into its own function, because that function can often be reused when scraping the home page later. There is no need to write the parsing for the same type of element twice.
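A minimal sketch of that split, written against the plain Kotlin standard library so it runs on its own. Everything here is an assumption made for illustration: `SearchItem` is a hypothetical stand-in for `SearchResponse`, the markup in the comment is invented, and a regex replaces the HTML `Element` parsing a real provider would normally do with CSS selectors.

```kotlin
// Hypothetical stand-in for CloudStream's SearchResponse, so the sketch runs without the app.
data class SearchItem(val name: String, val url: String, val posterUrl: String?)

// Assumed (invented) markup for one search item:
// <div class="item"><a href="URL"><img src="POSTER" alt="NAME"></a></div>
val itemRegex = Regex(
    """<div class="item">\s*<a href="([^"]+)">\s*<img src="([^"]*)" alt="([^"]*)">"""
)

// Converts one matched search item into a result object.
// Kept as a separate function so home page parsing can reuse it.
fun MatchResult.toSearchItem(): SearchItem {
    val (url, poster, name) = destructured
    return SearchItem(name, url, poster.ifBlank { null })
}

// Finds every search item in the page and converts each one.
fun parseSearchResults(html: String): List<SearchItem> =
    itemRegex.findAll(html).map { it.toSearchItem() }.toList()
```

The point of the design is that `toSearchItem` knows nothing about where the element came from, so the same conversion works for search pages and home page lists alike.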
## 2. Loading the home page
Getting the homepage is essentially the same as getting search results but with a twist: you define the queries in a variable like this:
```kotlin
override val mainPage = mainPageOf(
    Pair("1", "Recent Release - Sub"),
    Pair("2", "Recent Release - Dub"),
    Pair("3", "Recent Release - Chinese"),
)
```
This dictates what the `getMainPage` function receives as arguments.
For example, when the recent dubbed shows need to be loaded, `getMainPage` is called with a page number and the request you defined above.
```kotlin
override suspend fun getMainPage(
    page: Int,
    request: MainPageRequest
): HomePageResponse {
    // page: an integer > 0; starts at 1 and counts up depending on how far the user has scrolled.
    // request.data == "2"
    // request.name == "Recent Release - Dub"
    // ...
}
```
With these variables, fetch the appropriate list of SearchResponse objects like this:
```kotlin
// Gogoanime
override suspend fun getMainPage(
    page: Int,
    request: MainPageRequest
): HomePageResponse {
    // Use the data you defined earlier in the pair to send the request you want.
    val params = mapOf("page" to page.toString(), "type" to request.data)
    // ...
}
```
## 3. Loading the show page
The show page is a bit more complex than the search results, but it uses the same logic: CSS selectors and regex parse the HTML into a Kotlin object. With the amount of info being parsed this function can get quite big, but the fundamentals are still pretty simple.
The only difficulty is getting the episodes: they are not always part of the HTML. Check whether your browser sends any extra requests when you visit the episodes page.
**NOTE**: Episodes in CloudStream are not paginated, meaning that if you have a show with 21 seasons, all on different website pages, you will need to parse them all.
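Collecting every season into one list can be sketched as below. This is a simplified illustration under stated assumptions: `EpisodeInfo` is a hypothetical stand-in for the episode objects a provider returns, and `fetchSeasonPage` is a hypothetical callback that would download and parse a single season page (in a real provider it would be a suspending network call).

```kotlin
// Hypothetical episode model; a real provider would build CloudStream's own episode objects.
data class EpisodeInfo(val season: Int, val number: Int, val url: String)

// Collects the episodes of every season into one flat list, in season order.
// `fetchSeasonPage` is a hypothetical callback that loads and parses one season page.
fun collectAllEpisodes(
    seasonCount: Int,
    fetchSeasonPage: (season: Int) -> List<EpisodeInfo>
): List<EpisodeInfo> =
    (1..seasonCount).flatMap { season -> fetchSeasonPage(season) }
```

The cost is one request per season page, so a show with 21 seasons means 21 requests before the show page can be displayed.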
## 4. Loading the video links
Getting the video links is usually the hardest part of scraping video sites, and the reason is the cost of hosting videos.
Bandwidth is expensive, so video hosts need to recoup their expenses with ads, but when scraping we bypass all ads.
This gives video hosts a big monetary incentive to make it as hard as possible to get the video links.
It also means that you cannot write one piece of skeleton code to scrape all video hosts: they are all unique.
You will need to write a customized scraper for each video host. There are some common obfuscation techniques you should know about, along with how to detect them.
### Obfuscation techniques to know about:
**Base64:**
This is one of the most common obfuscation techniques, and you need to be able to spot it inside documents. It is used to hide important text in plain view.
It looks something like this: `VGhpcyBpcyBiYXNlNjQgZW5jb2RlIHRleHQuIA==`
A dead giveaway that a string is Base64 or something similar is that it ends with `==`. Watch out for that ending, but note that it is not always present: the `=` padding only appears when the encoded data does not fill the last group completely. If you see any suspicious string that mixes uppercase and lowercase letters (A-Z, a-z) with some numbers, immediately check whether it is Base64.
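A quick way to run that check is sketched below using `java.util.Base64` from the JDK. The helper names and the length threshold are assumptions chosen for illustration, and the heuristic only covers standard padded Base64, not URL-safe or unpadded variants.

```kotlin
import java.util.Base64

// Heuristic: only Base64 alphabet characters, length a multiple of 4,
// and at most two '=' padding characters at the very end.
// The minimum length of 8 is an arbitrary cutoff to skip short ordinary words.
fun looksLikeBase64(text: String): Boolean =
    text.length >= 8 && text.length % 4 == 0 &&
        Regex("[A-Za-z0-9+/]+={0,2}").matches(text)

// Returns the decoded text, or null when the input is not valid Base64.
fun decodeBase64OrNull(text: String): String? =
    try {
        String(Base64.getDecoder().decode(text), Charsets.UTF_8)
    } catch (e: IllegalArgumentException) {
        null
    }
```

Passing the example string from above to `decodeBase64OrNull` returns `This is base64 encode text. ` (with a trailing space). If the decoded result is unreadable binary, the data is probably compressed or encrypted rather than plain text.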
**AES encryption:**
This is more annoying than Base64 for our purposes, but less common. Some responses may be encrypted using AES, and this is usually not too hard to spot.
Encrypted content is usually encoded in Base64 (which decodes to garbage), which makes it easier to spot. Sites are usually not too covert in their use of AES, and you should be alerted if any site contains references to `enc`, `iv` or `CryptoJS`. The name of the game here is to find the decryption key, which is most easily done with a debugger. If you can find where the decryption takes place in the code, usually through a library like `CryptoJS`, you can place a breakpoint there to find the key.
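Once the key and IV are known, the decryption itself is short. The sketch below uses the JDK's `javax.crypto` and assumes AES in CBC mode with PKCS#7-style padding and a raw key and IV, which will not match every site, so check what the site's script actually does. The function names are hypothetical, and the encrypt helper exists only so the sketch can be verified with a round trip.

```kotlin
import java.util.Base64
import javax.crypto.Cipher
import javax.crypto.spec.IvParameterSpec
import javax.crypto.spec.SecretKeySpec

// Decrypts a Base64-encoded payload with AES/CBC.
// Mode and padding are assumptions; adjust them to what the site uses.
fun decryptAesCbc(base64Payload: String, key: ByteArray, iv: ByteArray): String {
    val cipher = Cipher.getInstance("AES/CBC/PKCS5Padding")
    cipher.init(Cipher.DECRYPT_MODE, SecretKeySpec(key, "AES"), IvParameterSpec(iv))
    val plain = cipher.doFinal(Base64.getDecoder().decode(base64Payload))
    return String(plain, Charsets.UTF_8)
}

// Inverse operation, included only to test the decryption with known input.
fun encryptAesCbc(plainText: String, key: ByteArray, iv: ByteArray): String {
    val cipher = Cipher.getInstance("AES/CBC/PKCS5Padding")
    cipher.init(Cipher.ENCRYPT_MODE, SecretKeySpec(key, "AES"), IvParameterSpec(iv))
    val encrypted = cipher.doFinal(plainText.toByteArray(Charsets.UTF_8))
    return Base64.getEncoder().encodeToString(encrypted)
}
```

The key must be 16, 24 or 32 bytes long and the IV 16 bytes. If decryption throws a padding error, the key, IV, mode or padding is most likely wrong.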