golang http.Get() only returning a portion of html body

Question

CONTEXT: My goal is to fetch a page with http.Get(), parse resp.Body with the golang.org/x/net/html package, and extract some data from a set of <div>s whose id attributes follow a similar naming scheme, which I will match with a regex. The webpage is https://mandarintemple.com/learning-materials/radicals/

PROBLEM: I seem to get only a portion of the total html body. The network tab of dev tools shows a lot of GET requests, but only the first is of type html; all the others are css or js. In the inspector tab of dev tools I can see the <div>s I want inside the <body>, but when I read the response with io.ReadAll(resp.Body) and printed it to my editor's console, those <div>s were clearly not there.

I'm guessing that one or more of the js scripts create and add the <div>s I want, rather than them being present in the original html document the server responds with (they are popups you get when hovering over a Hanzi). Is there an easy way to verify this? As far as I can tell the <div>s are part of the html body, but this is the only explanation I have for not getting them in the response body from http.Get(), since I'm not getting any errors.

Whatever the cause is, I am looking for a way to get those popup <div>s in my response from http.Get(). If someone can help me understand or point me to some resources to check out, that would be greatly appreciated.

Here is the applicable code from func main() just to clarify what I have said above:

    resp, err := http.Get("https://mandarintemple.com/learning-materials/radicals/")
    if err != nil {
        // Stop here: resp is nil on error, so the deferred Close below would panic.
        log.Fatalf("failed to get resource via url with error: %v", err)
    }
    defer resp.Body.Close()

    readBody, err := io.ReadAll(resp.Body)
    if err != nil {
        log.Fatalf("failed to read body with error: %v", err)
    }
    bodyString := string(readBody)
    log.Println(bodyString)
    // I can clearly see that the <div>s aren't present in this output
    // even though much of the rest of the <body> appears the same as in the inspector of dev tools.

blami · Accepted Answer · 2025-10-07 01:56:33Z

2

As you correctly assume, the page uses JavaScript to generate this content. What you get with http.Get() is the static portion of the page as served by the web server. In the Developer Tools network tab, that corresponds to the response of type text/html to GET https://mandarintemple.com/learning-materials/radicals/.
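One way to check this from Go is to count how many id attributes in the static markup match the expected naming scheme. The following is a minimal sketch using only the standard library; the helper name countMatchingIDs, the sample markup, and the pattern ^popup-\d+$ are hypothetical placeholders, since the real id scheme is not given in the question.

```go
package main

import (
	"fmt"
	"log"
	"regexp"
)

// countMatchingIDs reports how many id attributes in doc match idPattern.
// It scans the raw markup with a regular expression, which is enough for a
// quick presence check but is not a substitute for a real HTML parser.
func countMatchingIDs(doc, idPattern string) (int, error) {
	idRe, err := regexp.Compile(idPattern)
	if err != nil {
		return 0, err
	}
	// Capture the value of every double-quoted id="..." attribute.
	// Requiring leading whitespace skips attributes such as data-id.
	attrRe := regexp.MustCompile(`\sid="([^"]*)"`)
	count := 0
	for _, m := range attrRe.FindAllStringSubmatch(doc, -1) {
		if idRe.MatchString(m[1]) {
			count++
		}
	}
	return count, nil
}

func main() {
	// Hypothetical sample standing in for the body returned by http.Get().
	sample := `<div id="popup-1"></div><div id="popup-2"></div><div id="footer"></div>`

	// "^popup-\d+$" is an assumed pattern; substitute the real id scheme.
	n, err := countMatchingIDs(sample, `^popup-\d+$`)
	if err != nil {
		log.Fatalf("bad pattern: %v", err)
	}
	fmt.Printf("matching ids in static HTML: %d\n", n)
}
```

In the question's code this would be called as countMatchingIDs(bodyString, pattern). A count of zero, while the inspector shows the elements, suggests they are added by JavaScript after the page loads.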

To get any <div> or other content generated dynamically by JavaScript (which runs in the user's browser, not on the server), you would need to fetch and run that JavaScript on top of the static content. Given the complexity of that task, "headless" browsers (browsers running without a visible window) are often used: the Go code drives the browser programmatically, telling it which page to load and which content to extract.
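As a rough illustration of the headless approach, the sketch below shells out to a locally installed Chromium-based browser and prints the DOM after scripts have run, using Chrome's --headless and --dump-dom flags. The binary name google-chrome, the 60 second timeout, and the helper headlessArgs are assumptions; this is not the only way to drive a headless browser, and dedicated Go driver libraries exist for finer control. If the popups are only created on hover, the DOM dumped after load may still not contain them, and a driver that can simulate events would be needed.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"os/exec"
	"time"
)

// headlessArgs returns the command-line arguments that ask a Chromium-based
// browser to load pageURL without a window and print the resulting DOM.
func headlessArgs(pageURL string) []string {
	return []string{"--headless", "--dump-dom", pageURL}
}

func main() {
	// "google-chrome" is an assumed binary name; it differs between systems.
	browser, err := exec.LookPath("google-chrome")
	if err != nil {
		log.Printf("no browser found: %v", err)
		return
	}

	// Bound the run so a hanging page cannot block forever.
	ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
	defer cancel()

	url := "https://mandarintemple.com/learning-materials/radicals/"
	out, err := exec.CommandContext(ctx, browser, headlessArgs(url)...).Output()
	if err != nil {
		log.Printf("headless run failed: %v", err)
		return
	}

	// out holds the serialized DOM, which can be parsed like any HTML document.
	fmt.Println(string(out))
}
```

The output can then be fed to the same parsing code that was meant for resp.Body.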

edited Oct 7 at 1:56

answered Oct 7 at 1:48


1 Comment

Zohanannas Setyawan Oct 7 at 1:59

readBody, err := io.ReadAll(resp.Body) if err != nil { log.Println("failed to read body with error: ", err) }
