Diffstat (limited to 'docs/INTERNALS')
-rw-r--r-- docs/INTERNALS 140
1 files changed, 140 insertions, 0 deletions
diff --git a/docs/INTERNALS b/docs/INTERNALS
new file mode 100644
index 000000000..0badf5b29
--- /dev/null
+++ b/docs/INTERNALS
@@ -0,0 +1,140 @@
+                                  _   _ ____  _
+                              ___| | | |  _ \| |
+                             / __| | | | |_) | |
+                            | (__| |_| |  _ <| |___
+                             \___|\___/|_| \_\_____|
+
+INTERNALS
+
+ The project is split in two: the library and the client. The client uses the
+ library, but the library is designed so that other applications can use it
+ too.
+
+ Thus, most of the code and complexity is in the library part.
+
+Windows vs Unix
+===============
+
+ There are a few differences in how to program curl the unix way compared to
+ the Windows way. The four most notable details are:
+
+ 1. Different function names for close(), read(), write()
+ 2. Windows requires a couple of init calls
+ 3. The file descriptors for network communication and file operations are
+ not easily interchangeable as in unix
+ 4. When writing data to stdout, Windows makes end-of-lines the DOS way, thus
+ destroying binary data, although you do want that conversion if it is
+ text coming through... (sigh)
+
+ In curl, (1) and (2) are handled with defines and macros, so that the source
+ looks the same everywhere except in the header file that defines them.
+
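A minimal sketch of the macro approach described above. The wrapper names (demo_sclose, demo_sread, demo_swrite) and the helper demo_send_all() are hypothetical and chosen for illustration; the names actually used in the curl headers may differ. The Windows branch maps to the Winsock calls closesocket(), recv() and send(), and it assumes the required init call (point 2) has already been made elsewhere.

```c
#include <stddef.h>

#ifdef WIN32
#include <winsock2.h>
/* Winsock has its own calls for socket close/read/write */
#define demo_sclose(s)        closesocket(s)
#define demo_sread(s, b, n)   recv((s), (char *)(b), (int)(n), 0)
#define demo_swrite(s, b, n)  send((s), (const char *)(b), (int)(n), 0)
#else
#include <unistd.h>
/* On unix a socket is an ordinary file descriptor */
#define demo_sclose(s)        close(s)
#define demo_sread(s, b, n)   read((s), (b), (n))
#define demo_swrite(s, b, n)  write((s), (b), (n))
#endif

/* Hypothetical caller: it only ever uses the wrapper names, so this
   function looks the same on both platforms. Returns the number of
   bytes sent, or -1 on error. */
static long demo_send_all(int fd, const char *buf, size_t len)
{
  size_t sent = 0;
  while(sent < len) {
    long n = (long)demo_swrite(fd, buf + sent, len - sent);
    if(n <= 0)
      return -1;
    sent += (size_t)n;
  }
  return (long)sent;
}
```

Only the header part differs per platform; everything below the #endif is shared.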
+ (3) is simply avoided by not trying any funny tricks on file descriptors.
+
+ (4) is left alone, giving Windows users problems when they pipe binary data
+ through stdout...
+
+ Inside the source code, I do make an effort to avoid '#ifdef WIN32'. All
+ conditionals that deal with features *should* instead be in the format
+ '#ifdef HAVE_THAT_WEIRD_FUNCTION'. Since Windows can't run configure scripts,
+ I maintain two config-win32.h files (one in / and one in src/) that are
+ supposed to look exactly like the config.h file that configure would have
+ generated on a Windows machine!
+
+Library
+=======
+
+ There are a few entry points to the library, namely each publicly defined
+ function that libcurl offers to applications. All of those functions are
+ rather small and easy to follow, except the single do-it-all function named
+ curl_urlget() (entry point in lib/url.c).
+
+ curl_urlget() takes a variable number of arguments, and they must all be
+ passed in pairs: the parameter-ID and the parameter-value. The list of
+ arguments must be ended with an end-of-arguments parameter-ID.
+
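The paired-argument convention can be sketched with <stdarg.h>. Everything named demo_* or DEMO_* below is hypothetical: the real parameter-IDs and the real curl_urlget() prototype are defined in the curl headers and are not reproduced here. The extra cfg pointer exists only so the sketch has somewhere to store what it parsed.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical parameter-IDs, for illustration only */
enum demo_tag {
  DEMO_TAG_END = 0,   /* end-of-arguments marker, takes no value */
  DEMO_TAG_URL,       /* value: const char * */
  DEMO_TAG_TIMEOUT    /* value: long */
};

struct demo_config {
  const char *url;
  long timeout;
};

/* Reads (ID, value) pairs until DEMO_TAG_END.
   Returns 0 on success, -1 on an unknown ID. */
static int demo_urlget(struct demo_config *cfg, ...)
{
  va_list ap;
  int rc = 0;
  int tag;

  cfg->url = NULL;
  cfg->timeout = 0;

  va_start(ap, cfg);
  while(rc == 0 && (tag = va_arg(ap, int)) != DEMO_TAG_END) {
    switch(tag) {
    case DEMO_TAG_URL:
      cfg->url = va_arg(ap, const char *);
      break;
    case DEMO_TAG_TIMEOUT:
      cfg->timeout = va_arg(ap, long);
      break;
    default:
      /* unknown ID: the type of its value is unknown, so stop here */
      rc = -1;
      break;
    }
  }
  va_end(ap);
  return rc;
}
```

A call would look like demo_urlget(&cfg, DEMO_TAG_URL, "http://example.com/", DEMO_TAG_TIMEOUT, 30L, DEMO_TAG_END). Each value must be passed with exactly the type its ID expects (note the 30L), since the compiler cannot check variadic arguments.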
+ The function then analyzes the URL, extracts its different components and
+ connects to the remote host. This may involve using a proxy and/or using
+ SSL. The GetHost() function in lib/hostip.c is used for looking up host
+ names.
+
+ When connected, the proper function is called. The functions are named after
+ the protocols they handle: ftp(), http(), dict(), etc. They all reside in
+ their respective files (ftp.c, http.c and dict.c).
+
+ The protocol-specific functions deal with protocol-specific negotiations and
+ setup. They can use the sendf() function (in lib/sendf.c) to send
+ printf-style formatted data to the remote host. When they are ready to make
+ the actual file transfer, they call the Transfer() function (in
+ lib/download.c). All printf()-style functions use the supplied clones in
+ lib/mprintf.c.
+
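To illustrate the idea of sending printf-style formatted data to a peer, here is a rough POSIX-only sketch. The function demo_sendf() and its signature are assumptions made for this example and are not the actual sendf() from lib/sendf.c. It uses the C library's vsnprintf() as a stand-in for the clones in lib/mprintf.c, and the fixed 1024-byte buffer is an arbitrary choice.

```c
#include <stdarg.h>
#include <stdio.h>
#include <unistd.h>

/* Formats the arguments printf-style and writes the result to fd.
   Returns the number of bytes written, or -1 on a formatting error,
   truncated output or write failure. */
static long demo_sendf(int fd, const char *fmt, ...)
{
  char buf[1024];
  va_list ap;
  int len;
  long sent = 0;

  va_start(ap, fmt);
  len = vsnprintf(buf, sizeof(buf), fmt, ap);
  va_end(ap);

  if(len < 0 || (size_t)len >= sizeof(buf))
    return -1;

  /* write() may send less than asked for, so loop until done */
  while(sent < len) {
    ssize_t n = write(fd, buf + sent, (size_t)(len - sent));
    if(n <= 0)
      return -1;
    sent += n;
  }
  return sent;
}
```

A protocol function could then issue a command with something like demo_sendf(sockfd, "USER %s\r\n", name), where sockfd and name are the caller's own variables.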
+ While transferring, the progress functions in lib/progress.c are called at
+ frequent intervals. The speedcheck functions in lib/speedcheck.c are also used
+ to verify that the transfer is as fast as required.
+
+ When the operation is done, the writeout() function in lib/writeout.c may be
+ called to report about the operation as specified previously in the arguments
+ to curl_urlget().
+
+ HTTP(S)
+
+ HTTP offers a lot and is the protocol in curl that uses the most lines of
+ code. There is a special file (lib/formdata.c) that offers all the multipart
+ post functions.
+
+ The base64 functions for user+password handling are in lib/base64.c, and all
+ functions for parsing and sending cookies are found in lib/cookie.c.
+
+ HTTPS uses almost exactly the same procedure as HTTP, with only two
+ exceptions: the connect procedure is different and the function used
+
+ FTP
+
+ The if2ip() function can be used for getting the IP number of a specified
+ network interface, and it resides in lib/if2ip.c.
+
+ TELNET
+
+ Telnet is implemented in lib/telnet.c.
+
+ FILE
+
+ The file:// protocol is dealt with in lib/file.c.
+
+ LDAP
+
+ Everything LDAP is in lib/ldap.c.
+
+ GENERAL
+
+ URL encoding and decoding, called escaping and unescaping in the source code,
+ is found in lib/escape.c.
+
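As an illustration of what escaping and unescaping mean here, the following sketch percent-encodes every byte outside the RFC 3986 "unreserved" set and decodes %XX sequences back. The function names are hypothetical and the choice of which characters to leave untouched is an assumption; lib/escape.c may apply different rules. It also assumes the default "C" locale for isalnum().

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns a malloc'ed percent-encoded copy of 'in' (caller frees),
   or NULL if out of memory. */
static char *demo_escape(const char *in)
{
  static const char hex[] = "0123456789ABCDEF";
  char *out = malloc(strlen(in) * 3 + 1); /* worst case: all encoded */
  char *p = out;
  if(!out)
    return NULL;
  for(; *in; in++) {
    unsigned char c = (unsigned char)*in;
    if(isalnum(c) || c == '-' || c == '.' || c == '_' || c == '~')
      *p++ = (char)c;
    else {
      *p++ = '%';
      *p++ = hex[c >> 4];
      *p++ = hex[c & 15];
    }
  }
  *p = '\0';
  return out;
}

/* Value of a hex digit, or -1 if c is not one */
static int demo_hexval(int c)
{
  if(c >= '0' && c <= '9') return c - '0';
  if(c >= 'a' && c <= 'f') return c - 'a' + 10;
  if(c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

/* Returns a malloc'ed decoded copy of 'in' (caller frees), or NULL if
   out of memory. A '%' not followed by two hex digits is copied as-is. */
static char *demo_unescape(const char *in)
{
  char *out = malloc(strlen(in) + 1);
  char *p = out;
  if(!out)
    return NULL;
  while(*in) {
    int hi, lo;
    if(*in == '%' &&
       (hi = demo_hexval((unsigned char)in[1])) >= 0 &&
       (lo = demo_hexval((unsigned char)in[2])) >= 0) {
      *p++ = (char)(hi * 16 + lo);
      in += 3;
    }
    else
      *p++ = *in++;
  }
  *p = '\0';
  return out;
}
```

With these rules, a space becomes %20 and a slash becomes %2F, and unescaping the result gives back the original string.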
+ While transferring data in Transfer(), a few functions might get
+ used. get_date() in lib/getdate.c is for HTTP date comparisons.
+
+ lib/getenv.c is for reading environment variables in a neat
+ platform-independent way. That's used in the client, but also in lib/url.c
+ when checking the PROXY variables.
+
+ lib/netrc.c holds the .netrc parser.
+
+ lib/timeval.c features replacement functions for systems that don't have
+
+ A function named curl_version() that returns the full curl version string is
+ found in lib/version.c.
+
+Client
+======
+
+ main() resides in src/main.c together with most of the client
+ code. src/hugehelp.c is automatically generated by the mkhelp.pl perl script
+ to display the complete "manual" and the src/urlglob.c file holds the
+ functions used for the multiple-URL support.
+
+ The client mostly works to set up its config struct properly, then it
+ calls the curl_urlget() function in the library, and when it gets control
+ back it checks the status and exits.
+