Invoke web page from Linux C


I need to read all the HTML text from a URL like http://localhost/index.html into a string in C.

I know that if I run telnet www.google.com 80 and send a GET request for the page, it returns all the HTML.

How do I do this in a Linux environment with C?


1:

I would suggest using a couple of libraries, which are commonly available on most Linux distributions: libcurl and libxml2. libcurl provides a comprehensive suite of HTTP features, and libxml2 provides a module for parsing HTML, called HTMLParser. Hope this points you in the right direction.

2:

Below is a rough outline of code (i.e. not enough error checking and I haven't tried to compile it) to get you started, but use http://www.tenouk.com/cnlinuxsockettutorials.html to learn socket programming. Look up gethostbyname if you need to translate a hostname (like google.com) into an IP address. You may also need to do some work to parse the content length out of the HTTP response and then make sure you keep calling recv until you've gotten all the bytes.
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

void getWebpage(char *buffer, int bufsize, const char *ipaddress)
{
    int sockfd;
    struct sockaddr_in destAddr;
    ssize_t n;
    int total = 0;

    if (bufsize <= 0)
        return;
    buffer[0] = '\0';

    if ((sockfd = socket(AF_INET, SOCK_STREAM, 0)) == -1) {
        fprintf(stderr, "Error opening client socket\n");
        return; // nothing to close: no socket was created
    }

    memset(&destAddr, 0, sizeof(destAddr));
    destAddr.sin_family = AF_INET;
    destAddr.sin_port = htons(80); // HTTP port is 80
    destAddr.sin_addr.s_addr = inet_addr(ipaddress); // Get int representation of IP

    if (connect(sockfd, (struct sockaddr *)&destAddr, sizeof(destAddr)) == -1) {
        fprintf(stderr, "Error with client connecting to server\n");
        close(sockfd);
        return;
    }

    // Send http request; the blank line (\r\n\r\n) marks the end of the request
    const char *httprequest = "GET / HTTP/1.0\r\n\r\n";
    send(sockfd, httprequest, strlen(httprequest), 0);

    // An HTTP/1.0 server closes the connection when it is done, so keep
    // reading until recv returns 0 (or the buffer is full)
    while (total < bufsize - 1 &&
           (n = recv(sockfd, buffer + total, bufsize - 1 - total, 0)) > 0)
        total += n;
    buffer[total] = '\0';
    close(sockfd);

    // Now buffer has the HTTP response which includes the webpage. You must either
    // trim off the HTTP header, or just leave it in depending on what you are doing
    // with the page
}
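The header trimming and content-length parsing mentioned above could look roughly like this. It is a sketch that assumes the response received so far is held in a NUL-terminated buffer; http_body_start and http_content_length are hypothetical names chosen for this example.

```c
#include <stdlib.h>
#include <string.h>
#include <strings.h>   /* strncasecmp */

/* Return a pointer to the first byte after the blank line that ends the
   HTTP headers, or NULL if the headers are not complete yet. */
const char *http_body_start(const char *response)
{
    const char *end = strstr(response, "\r\n\r\n");
    return end ? end + 4 : NULL;
}

/* Return the Content-Length value from the headers, or -1 if the header
   is missing or the headers are incomplete. */
long http_content_length(const char *response)
{
    static const char name[] = "Content-Length:";
    const char *body = http_body_start(response);
    const char *line = response;

    if (body == NULL)
        return -1;

    /* Walk the header block one line at a time; header names are
       case-insensitive, hence strncasecmp. */
    while (line < body) {
        if (strncasecmp(line, name, sizeof name - 1) == 0)
            return strtol(line + sizeof name - 1, NULL, 10);
        line = strstr(line, "\r\n");
        if (line == NULL)
            break;
        line += 2;
    }
    return -1;
}
```

With these, the receive loop can stop once the number of bytes after http_body_start() reaches http_content_length(). If the server sends no Content-Length, as HTTP/1.0 servers often do, reading until recv returns 0 is the fallback.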

3:

You use sockets, query the web server over HTTP (where you have "http://localhost/index.html"), and then parse the data you receive. Helpful if you are a beginner in socket programming: http://beej.us/guide/bgnet/
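As a sketch of the "query the web server over HTTP" step, the function below turns a URL into the host to connect to and the request text to send. build_get_request is a hypothetical helper written for this example, and it assumes a plain http:// URL without a port number.

```c
#include <stdio.h>
#include <string.h>

/* Split an http:// URL into host and path and format an HTTP/1.0 GET
   request for it. Returns 0 on success, -1 if the URL is not plain
   http:// or a buffer is too small. Ports in the URL are not handled. */
int build_get_request(const char *url,
                      char *host, size_t hostsize,
                      char *request, size_t reqsize)
{
    static const char scheme[] = "http://";
    const char *start, *path;
    size_t hostlen;
    int n;

    if (strncmp(url, scheme, sizeof scheme - 1) != 0)
        return -1;
    start = url + sizeof scheme - 1;

    /* The host runs up to the first '/', or to the end of the string. */
    path = strchr(start, '/');
    hostlen = path ? (size_t)(path - start) : strlen(start);
    if (hostlen == 0 || hostlen >= hostsize)
        return -1;
    memcpy(host, start, hostlen);
    host[hostlen] = '\0';

    /* An empty path means the root document. */
    n = snprintf(request, reqsize,
                 "GET %s HTTP/1.0\r\nHost: %s\r\n\r\n",
                 path ? path : "/", host);
    return (n < 0 || (size_t)n >= reqsize) ? -1 : 0;
}
```

For "http://localhost/index.html" this yields the host "localhost" and a request starting with "GET /index.html HTTP/1.0". The host is what you resolve and connect to on port 80, and the request is what you pass to send().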

4:

If you really don't feel like messing around with sockets, you could always create a named temp file, fork off a process and execvp() it to run wget -O, and then read the input from this temp file. This would be a pretty lame and inefficient way to do things, but it would mean you wouldn't have to mess with TCP and sending HTTP requests.
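A rough sketch of the fork and execvp() part, assuming wget is installed and on the PATH. run_and_wait and download_with_wget are hypothetical names used only for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork, exec argv[0] with the given arguments, and wait for it to finish.
   Returns the child's exit status (127 if execvp failed), or -1 if
   fork/waitpid failed or the child was killed by a signal. */
int run_and_wait(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);             /* only reached if execvp failed */
    }
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Download url into the file at path by running: wget -q -O path url */
int download_with_wget(const char *url, const char *path)
{
    char *argv[] = { "wget", "-q", "-O", (char *)path, (char *)url, NULL };
    return run_and_wait(argv);
}
```

The temp file name can come from mkstemp(). Once download_with_wget() returns 0, the file is read back like any other. Passing the arguments as an argv array rather than a shell command line means the URL needs no quoting.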

5:

Assuming you know how to read a file into a string, I'd try:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *read_file_into_string(FILE *fp); // you write this function

const char *url_contents(const char *url) {
    // create w3m command and pass it to popen()
    int bufsize = strlen(url) + 100;
    char *buf = malloc(bufsize);
    if (buf == NULL) return NULL;
    snprintf(buf, bufsize, "w3m -dump_source '%s'", url);

    // get a file handle, read all the html from it, close, and return
    FILE *html = popen(buf, "r");
    free(buf);
    if (html == NULL) return NULL;
    const char *s = read_file_into_string(html);
    pclose(html); // streams opened with popen() are closed with pclose()
    return s;
}
You fork a process, but it's a lot easier to let w3m do the heavy lifting.
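One possible version of the read_file_into_string() function left as an exercise above. This is only a sketch: the 4096-byte starting size and the doubling strategy are arbitrary choices, not something the answer specifies.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read everything from fp until EOF into a malloc'd, NUL-terminated string.
   Returns NULL on allocation or read failure; the caller frees the result. */
char *read_file_into_string(FILE *fp)
{
    size_t cap = 4096, len = 0, n;
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;

    /* Always leave one byte free for the terminating NUL. */
    while ((n = fread(buf + len, 1, cap - len - 1, fp)) > 0) {
        len += n;
        if (cap - len - 1 == 0) {       /* buffer full: double it */
            char *grown = realloc(buf, cap * 2);
            if (grown == NULL) {
                free(buf);
                return NULL;
            }
            buf = grown;
            cap *= 2;
        }
    }
    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

It works on any readable stream, so the same function serves for a pipe from popen() and for a regular file opened with fopen().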

