HTTP (Hypertext Transfer Protocol) is a client-server communication protocol widely used for the web. The SimpleHTTPServer module in Python provides a way to create a web server in Python that can serve files from a directory. This module provides a basic web server with a minimal feature set, but it is often useful for quick development or testing.
The SimpleHTTPServer module default returns a directory listing in HTML format, sorted alphabetically, when the URL points to a directory. The default behavior is useful, but there are cases where it is desirable to sort the directory listing by modification date and show the modified date in a formatted way on the HTML page. This technical post will describe how we modified the SimpleHTTPServer module to accomplish this.
The SimpleHTTPServer module provides a function called list_directory, which generates the HTML directory listing. We first need to obtain the file modification times to sort the listing by modification date. We can use the os.path.getmtime(a)
function to retrieve this information for each file in the directory. Once we have the modification times, we can sort the file list using the sort()
function based on the modification times.
The modified list_directory
function is as follows:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 | def list_directory( self , path): """Helper to produce a directory listing (absent index.html). Return value is either a file object, or None (indicating an error). In either case, the headers are sent, making the interface the same as for send_head(). """ try : list = os.listdir(path) except os.error: self .send_error( 404 , "No permission to list directory" ) return None list .sort(key = lambda a: os.path.getmtime(a), reverse = True ) f = StringIO() displaypath = cgi.escape(urllib.unquote( self .path)) f.write( '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">' ) f.write( "<html>\n<title>Directory listing for %s</title>\n" % displaypath) f.write( "<body>\n<h2>Directory listing for %s</h2>\n" % displaypath) f.write( "<hr>\n<ul>\n" ) for name in list : fullname = os.path.join(path, name) displayname = linkname = name # Append / for directories or @ for symbolic links if os.path.isdir(fullname): displayname = name + "/" linkname = name + "/" if os.path.islink(fullname): displayname = name + "@" # Note: a link to a directory displays with @ and links with / file_time = datetime.datetime.fromtimestamp(os.path.getmtime(fullname)) f.write( '<li>%s <a href="%s">%s</a>\n' % (file_time.strftime( "%Y-%m-%d, %H:%M" ), urllib.quote(linkname), cgi.escape(displayname))) f.write( "</ul>\n<hr>\n</body>\n</html>\n" ) length = f.tell() f.seek( 0 ) self .send_response( 200 ) self .send_header( "Content-type" , "text/html" ) self .send_header( "Content-Length" , str (length)) self .end_headers() return f |
We start by calling os.listdir()
to get the list of files in the directory. We then sort the list based on the modification time of each file using the key parameter of the sort()
function. The lambda function passed to the key parameter returns the modification time of each file using the os.path.getmtime(a)
function.
Next, we generate the HTML for the directory listing. We use the cgi.escape()
function to escape any characters that have special meanings in HTML. We then use the datetime.datetime.fromtimestamp()
function to convert the modification time to a DateTime.
Full Source Code
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 | """Simple HTTP Server. This module builds on BaseHTTPServer by implementing the standard GET and HEAD requests in a fairly straightforward manner. """ __version__ = "0.6" __all__ = [ "SimpleHTTPRequestHandler" ] import os import posixpath import BaseHTTPServer import urllib import cgi import shutil import mimetypes import datetime try : from cStringIO import StringIO except ImportError: from StringIO import StringIO class SimpleHTTPRequestHandler(BaseHTTPServer.BaseHTTPRequestHandler): """Simple HTTP request handler with GET and HEAD commands. This serves files from the current directory and any of its subdirectories. The MIME type for files is determined by calling the .guess_type() method. The GET and HEAD requests are identical except that the HEAD request omits the actual contents of the file. """ server_version = "SimpleHTTP/" + __version__ def do_GET( self ): """Serve a GET request.""" f = self .send_head() if f: self .copyfile(f, self .wfile) f.close() def do_HEAD( self ): """Serve a HEAD request.""" f = self .send_head() if f: f.close() def send_head( self ): """Common code for GET and HEAD commands. This sends the response code and MIME headers. Return value is either a file object (which has to be copied to the outputfile by the caller unless the command was HEAD, and must be closed by the caller under all circumstances), or None, in which case the caller has nothing further to do. """ path = self .translate_path( self .path) f = None if os.path.isdir(path): if not self .path.endswith( '/' ): # redirect browser - doing basically what apache does self .send_response( 301 ) self .send_header( "Location" , self .path + "/" ) self .end_headers() return None for index in "index.html" , "index.htm" : index = os.path.join(path, index) if os.path.exists(index): path = index break else : return self .list_directory_by_date(path) ctype = self .guess_type(path) try : # Always read in binary mode. Opening files in text mode may cause # newline translations, making the actual size of the content # transmitted *less* than the content-length! f = open (path, 'rb' ) except IOError: self .send_error( 404 , "File not found" ) return None self .send_response( 200 ) self .send_header( "Content-type" , ctype) fs = os.fstat(f.fileno()) self .send_header( "Content-Length" , str (fs[ 6 ])) self .send_header( "Last-Modified" , self .date_time_string(fs.st_mtime)) self .end_headers() return f def list_directory_by_date( self , path): """Helper to produce a directory listing (absent index.html). Return value is either a file object, or None (indicating an error). In either case, the headers are sent, making the interface the same as for send_head(). """ try : list = os.listdir(path) except os.error: self .send_error( 404 , "No permission to list directory" ) return None list .sort(key = lambda a: os.path.getmtime(a), reverse = True ) f = StringIO() displaypath = cgi.escape(urllib.unquote( self .path)) f.write( '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">' ) f.write( "<html>\n<title>Directory listing for %s</title>\n" % displaypath) f.write( "<body>\n<h2>Directory listing for %s</h2>\n" % displaypath) f.write( "<hr>\n<ul>\n" ) for name in list : fullname = os.path.join(path, name) displayname = linkname = name # Append / for directories or @ for symbolic links if os.path.isdir(fullname): displayname = name + "/" linkname = name + "/" if os.path.islink(fullname): displayname = name + "@" # Note: a link to a directory displays with @ and links with / file_time = datetime.datetime.fromtimestamp(os.path.getmtime(fullname)) f.write( '<li>%s <a href="%s">%s</a>\n' % (file_time.strftime( "%Y-%m-%d, %H:%M" ), urllib.quote(linkname), cgi.escape(displayname))) f.write( "</ul>\n<hr>\n</body>\n</html>\n" ) length = f.tell() f.seek( 0 ) self .send_response( 200 ) self .send_header( "Content-type" , "text/html" ) self .send_header( "Content-Length" , str (length)) self .end_headers() return f def list_directory( self , path): """Helper to produce a directory listing (absent index.html). Return value is either a file object, or None (indicating an error). In either case, the headers are sent, making the interface the same as for send_head(). """ try : list = os.listdir(path) except os.error: self .send_error( 404 , "No permission to list directory" ) return None list .sort(key = lambda a: a.lower()) f = StringIO() displaypath = cgi.escape(urllib.unquote( self .path)) f.write( '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">' ) f.write( "<html>\n<title>Directory listing for %s</title>\n" % displaypath) f.write( "<body>\n<h2>Directory listing for %s</h2>\n" % displaypath) f.write( "<hr>\n<ul>\n" ) for name in list : fullname = os.path.join(path, name) displayname = linkname = name # Append / for directories or @ for symbolic links if os.path.isdir(fullname): displayname = name + "/" linkname = name + "/" if os.path.islink(fullname): displayname = name + "@" # Note: a link to a directory displays with @ and links with / f.write( '<li><a href="%s">%s</a>\n' % (urllib.quote(linkname), cgi.escape(displayname))) f.write( "</ul>\n<hr>\n</body>\n</html>\n" ) length = f.tell() f.seek( 0 ) self .send_response( 200 ) self .send_header( "Content-type" , "text/html" ) self .send_header( "Content-Length" , str (length)) self .end_headers() return f def translate_path( self , path): """Translate a /-separated PATH to the local filename syntax. Components that mean special things to the local file system (e.g. drive or directory names) are ignored. (XXX They should probably be diagnosed.) """ # abandon query parameters path = path.split( '?' , 1 )[ 0 ] path = path.split( '#' , 1 )[ 0 ] path = posixpath.normpath(urllib.unquote(path)) words = path.split( '/' ) words = filter ( None , words) path = os.getcwd() for word in words: drive, word = os.path.splitdrive(word) head, word = os.path.split(word) if word in (os.curdir, os.pardir): continue path = os.path.join(path, word) return path def copyfile( self , source, outputfile): """Copy all data between two file objects. The SOURCE argument is a file object open for reading (or anything with a read() method) and the DESTINATION argument is a file object open for writing (or anything with a write() method). The only reason for overriding this would be to change the block size or perhaps to replace newlines by CRLF -- note however that this the default server uses this to copy binary data as well. """ shutil.copyfileobj(source, outputfile) def guess_type( self , path): """Guess the type of a file. Argument is a PATH (a filename). Return value is a string of the form type/subtype, usable for a MIME Content-type header. The default implementation looks the file's extension up in the table self.extensions_map, using application/octet-stream as a default; however it would be permissible (if slow) to look inside the data to make a better guess. """ base, ext = posixpath.splitext(path) if ext in self .extensions_map: return self .extensions_map[ext] ext = ext.lower() if ext in self .extensions_map: return self .extensions_map[ext] else : return self .extensions_map[''] if not mimetypes.inited: mimetypes.init() # try to read system mime.types extensions_map = mimetypes.types_map.copy() extensions_map.update({ ' ': ' application / octet - stream', # Default '.py' : 'text/plain' , '.c' : 'text/plain' , '.h' : 'text/plain' , }) def test(HandlerClass = SimpleHTTPRequestHandler, ServerClass = BaseHTTPServer.HTTPServer): BaseHTTPServer.test(HandlerClass, ServerClass) if __name__ = = '__main__' : test() |
This post is also available in: Greek