Today I released a new version of the COS library, best known for its MultipartRequest and MultipartParser file upload components. This release adds a few browser workarounds, a new exception subtype, and support for esoteric platforms.
It's available at http://servlets.com/cos.
Some people have been asking me when and where I'll be speaking in 2005, so here's a short rundown.
February 22: I'm speaking at the SDForum in Palo Alto, CA, on XQuery and Web Services.
March 3 and 4: I'm speaking at The Server Side conference in Las Vegas, NV. I'm giving two talks, one titled Open Source from the Inside and another titled just XQuery.
March 17: I'm speaking at the SD West conference in Santa Clara, CA, presenting an XQuery Case Study.
May 25: I'm speaking at XTech 2005 in Amsterdam, presenting a talk titled XQuery Search and Update, where I'll show how XQuery can support search and update, based on our experience implementing these features at Mark Logic.
August 1-5: I'm likely to speak at the O'Reilly Open Source Convention in Portland, OR. The topics haven't been chosen.
Then throughout the year in about 12 different cities in the US I'll be part of the No Fluff Just Stuff tour. This is my fourth year on the tour. At each venue there are about 10 really excellent speakers. I usually speak on about 5 or 6 topics, with new ones every year. This year I'll be presenting: "Open Source from the Inside", "An Introduction to XQuery", "XQuery By Example: Advanced Web Publishing", "Java Metadata", "Extreme Web Caching", and "Forgotten Algorithms II". The likely cities where I'll be speaking are: Houston, Minneapolis, Denver, Reston, Raleigh, Austin, San Diego, Phoenix, Salt Lake City, Dallas, Chicago, Seattle, and then Denver again.
Recently I started getting a number of bogus ISP reviews submitted to the Servlets.com ISP review system. Each is just a long list of links, the same type of junk you see being posted to blog comment systems in an attempt by the spammer to improve their sites' Google rankings. It's the kind of stuff that made me shut down the comment system on this blog. My guess is the ISP review form looks enough like the MovableType (or some other software's) comment form to look like a target to the bots. I moderate all new posts so they never get seen, but that doesn't stop the bots from trying.
What I did is simple: I made more fields required, including ones the robots don't know to fill in. Now I'm wondering...
Is it time to use less obvious form names? "comment" is nicely mnemonic, but perhaps it should be off limits now.
How long before someone writes a spider client that fills in random but heuristically intelligent values to form fields on the web? It could even go so far as to enter checksum-valid credit card numbers into order forms, if the form field names are somewhat standard. I'm wondering if before long we'll have to fill out more "what is the word hidden in this graphic", are-you-really-a-human tests.
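The check digit on card numbers is the Luhn mod-10 checksum, and it's simple enough that any bot could generate numbers that pass it. Here's a minimal sketch of the check; the class and method names are mine, purely for illustration.

```java
/** Sketch of the Luhn mod-10 check used for card number check digits. */
final class LuhnCheck {

    /** Returns true if the digit string passes the Luhn checksum. */
    static boolean isValid(String number) {
        if (number.isEmpty()) {
            return false;
        }
        int sum = 0;
        boolean doubleIt = false; // the rightmost digit is not doubled
        for (int i = number.length() - 1; i >= 0; i--) {
            char c = number.charAt(i);
            if (c < '0' || c > '9') {
                return false; // digits only; no spaces or dashes in this sketch
            }
            int d = c - '0';
            if (doubleIt) {
                d *= 2;
                if (d > 9) {
                    d -= 9; // same as adding the two digits of the product
                }
            }
            sum += d;
            doubleIt = !doubleIt;
        }
        return sum % 10 == 0;
    }
}
```

Passing this check says nothing about whether a card exists. It only catches typos, which is exactly why it wouldn't slow a bot down.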
Hmm, there's probably big (but dirty) money to be made in writing an automatic word recognizer. Then you can make big (and clean) money writing the next generation are-you-human test.
Another loss of innocence for the net...
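For the curious, the required-fields change amounts to something like the sketch below. The field names are made up for illustration (they aren't the real form's names), and the real check lives inside the review servlet rather than in a standalone class.

```java
import java.util.List;
import java.util.Map;

/** Sketch of a required-field check for a review form; all field names are hypothetical. */
final class ReviewFormCheck {

    // Assumed field names, not the actual Servlets.com form fields.
    static final List<String> REQUIRED =
            List.of("ispName", "reviewerEmail", "rating", "monthsAsCustomer", "comment");

    /** Returns true only if every required field is present and non-blank. */
    static boolean looksComplete(Map<String, String> params) {
        for (String name : REQUIRED) {
            String value = params.get(name);
            if (value == null || value.trim().isEmpty()) {
                return false; // a bot that fills in only the obvious fields fails here
            }
        }
        return true;
    }
}
```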
I moved Servlets.com and JDOM.org from Colorado to Oregon this week. With any luck you hardly noticed.
The reason is I decided to abandon Firstlink, my Colorado ISP, because their quality had started to suffer. For my new provider I've chosen Kattare, based in Oregon, which has earned the most consistently high marks on the Servlets.com ISP rating pages. Kattare is also conveniently located, as they're based in my old hometown. I can meet the guys for a beer while visiting my folks.
Now, truth be told, I absolutely hate moving machines. There are so many loose ends you have to watch out for. The one thing I enjoy about a move though is the chance to make serious upgrades on the software. Before a new box goes live, you can experiment and break things without danger and flip the switch only after you've leisurely cleaned up your mess.
As part of the move I've upgraded from Tomcat 3.2.1 to Jetty 4.2.21. Even though I helped develop the earliest version of Tomcat and have an affinity toward it, I decided to hop off the Tomcat 4.x and 5.x upgrade path. In these later versions Tomcat is no longer the lean alley cat it once was. It's gotten a bit "thick in the middle".
Here are the Tomcat 5.x dependencies as pulled from gump: ant xml-xerces jaf javamail jmx jsse xml-xalan2 jakarta-tomcat-coyote jakarta-tomcat-catalina jakarta-tomcat-jasper_tc5 jakarta-tomcat-util jakarta-servletapi-5-servlet jakarta-servletapi-5-jsp jakarta-regexp commons-el commons-modeler txt2html-task jakarta-struts junit jndi jakarta-site2 jdbc jaas jta ldap commons-beanutils commons-fileupload commons-launcher commons-collections commons-digester commons-dbcp commons-pool commons-logging commons-daemon xml-apis.
Here are the Jetty 4.x dependencies: ant commons-logging xml-xerces mx4j xml-apis.
It's 35 to 5, by my count. In the game of Bloat, lower scores win.
Apache friends I trust said they've had much better luck with Jetty and its more focused design, so I followed their advice. I've been very pleased. It's easy to configure, starts very quickly, and has great documentation. Only a few of the book examples aren't working yet, mostly those related to security. Servlets are portable but their security configurations (i.e. user lists) aren't.
While I had the box cracked open, I upgraded from Apache httpd 1.3 to Apache httpd 2.0. This was mostly so I could also upgrade from CVS to Subversion (which uses the Apache 2.0 WebDAV protocol). I've been using Subversion for the last couple of months to set up and manage http://xqzone.marklogic.com and it's been a joy. The Subversion designers and developers have done a great job. I haven't moved JDOM to Subversion yet, but I want to.
Mailman got an upgrade too, to 2.1.5, which lets me directly reject any non-subscriber mail. The problem with Mailman (even 2.1.5) is that it doesn't let posts time out. I like the ezmlm way of doing things where the moderator gets an email with each new post, and if the moderator takes no action it's discarded in two weeks. So as moderator you only have to take an action to OK a valid post (seemingly 1 out of 100 times). Of course the downside of ezmlm is that there's no way for a group of moderators to see which posts someone else has already dealt with. Ah, they all suck I guess.
P.S. I'm again allowing comments on my blog posts. As part of the move I found the MovableType option to strip out HTML from the comments, so that should remove the incentive for spammers to post links. Of course, their bots may still try and we may get a lot of crufty posts. We'll see.
If you're looking for "hot teen sex", I'm sorry but you're going to have to look somewhere else. The latest thing these days, as you probably know if you're a regular blog reader, is for spammers to automatically add comments to blogs that have links to their own crap pages. Sometimes hundreds of links. They do this in the hope that Google will spider this blog, find these links, and from the links artificially raise the PageRank of the crap site.
Each time this blog gets hit I have to go through all the entries manually and remove the offending comments. MovableType, the version I have installed right now, doesn't make this easy. I guess the Trotts who wrote MT were optimists and thought only about providing value. But these days you've gotta think like a spammer: "Hey, this technology is useful! How could we abuse it and make it go away?"
It wasn't a big problem here because I changed the comment form tags a little to make the form harder for spiders to spot, but the script kiddies have gone general, and yesterday I had hundreds of fake comments for "hot teen sex" and "las vegas". I don't want to deal with it, so until I upgrade MovableType to a version that has some spam blocking abilities, I'm turning off comments. Sadly, this means old comments aren't viewable either.
What do you all think about this? I guess I won't know.
Oracle Magazine just published an article of mine called XQuery Tricks and Traps. This article, the third in a series, focuses on the important but tricky and commonly misunderstood aspects of the XQuery language. I expect it will be of interest to newbies and experts alike.
http://otn.oracle.com/oramag/oracle/04-jan/o14dev_xml.html
I stumbled on a new RSS news reader tonight, and it's fantastic. If you're using Windows check out FeedDemon.
It's a funny story how I stumbled upon it: My laptop was suddenly sluggish so I went to the Task Manager to check who was hogging memory. The worst offender was SharpReader (my former RSS newsreader) taking up a ludicrous 80 Megs. I killed it and griped how an RSS reader designed to experiment with .NET ends up consuming 80 Megs when minimized. The system got faster, but I was in the mood to kill processes. You know how it is, once you get the taste of blood. I noticed QCTRAY.EXE eating some memory and wondered what sort of animal it was. I searched the hard drive. Among other places, I found it in c:\windows\prefetch. That distracted me enough from the hunt to wonder what a prefetch dir is for. I googled it and found lots of info including a utility at http://www.majorgeeks.com to change the Windows prefetch behavior. I'm a Major Geek (as you can no doubt tell by the story thus far) so I went exploring what else was there. I stumbled on FeedDemon and since I was just griping about SharpReader (and because FeedDemon had a killer screenshot) I decided to give it a try. Now at the end of that odd path, I've got a nice new freeware newsreader that's stylish and feature-rich and lives in just 6 Megs.
I've improved the Servlets.com Soapbox page to list all the articles I've been publishing lately and some interesting historical articles (like my news article breaking the news on the formation of Apache Jakarta).
I added and updated several dozen ISP listings tonight. To those who submitted the ISPs, please come back and add reviews. I know the reviews have meant a lot to people shopping for ISPs.
The top story on JavaWorld is one of mine, covering what's changed between Servlet API 2.3 and 2.4. Enjoy!
Ryan purchased a big hosting package for a large project that got cancelled. Rather than let the space sit idle, he wrote to me with an offer to subdivide it and host people's programming-related sites for free. This is his way to "give back" he tells me. Yep, it has to be programming-related. And you'll be asked to have a banner but nothing more.
If you're interested, look for "Ryan's Freebie" on the ISP listing page. His and all other free ISPs are marked with a red "Free!".
This release includes numerous file upload enhancements, bug fixes, and workarounds for various browsers. Among the most important:
- Resolved an issue where exceptions were thrown if a boundary fell
exactly at the end of the 64k read buffer.
- Enhanced MultipartRequest to include query string parameters in
its parameter list.
- Added support to MultipartParser for browsers that send preambles.
- Added a MultipartParser constructor that takes a character encoding.
- Added support for unquoted header values (helpful to lynx browsers).
- Added support for file names as part of the Content-Type header
(helpful to Opera browsers).
- Made DefaultFileRenamePolicy thread-safe.
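To give a feel for the unquoted-header workaround, here's a rough sketch of pulling a parameter out of a header line whether or not the browser quoted the value. This is not the library's parsing code; the class, the method, and the regular expression are all mine for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: read a header parameter that may or may not be quoted. */
final class HeaderParam {

    /** Returns the value of the named parameter, or null if it is absent. */
    static String get(String header, String key) {
        // The key must start the header or follow a ';', so "name" never
        // matches inside "filename". The value is either "quoted" or a bare token.
        Pattern p = Pattern.compile(
                "(?:^|;)\\s*" + Pattern.quote(key) + "\\s*=\\s*(?:\"([^\"]*)\"|([^;\\s]+))",
                Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(header);
        if (!m.find()) {
            return null;
        }
        return m.group(1) != null ? m.group(1) : m.group(2);
    }
}
```

Given `form-data; name=file1; filename=a.txt` it returns the same values as it does for the quoted form most browsers send.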
I also enhanced Base64Encoder/Decoder and the MailMessage class, and added new entries to the FAQ and Servlet Bugs You Need to Know About listings.
Nearly all these improvements came as suggestions from users. My sincere thanks to everyone who wrote in with ideas and problem diagnoses. You'll see your names in the code. Keep the ideas coming!
I just posted a new poll question on Servlets.com asking, "Would you buy a Java Servlet Programming, 3rd Edition, that covers the upcoming Servlet API 2.4?" API 2.4 is getting close to a release, and it's time for me to decide if I'm going to rearrange my life again to crank out a new edition. I'd hate to do it if there's not interest. So please write in and let me know!
Our poll on .NET had nearly 4,000 respondents. A little over half said "No way!", they wouldn't consider .NET. About a quarter said it's possible. 7% were getting started soon, and a brave 12% were already using .NET. As for myself, my contract for .NET development got delayed but should start up again in the next month or two. We're going to look at how well J# lets us leverage our Java codebase. I'll blog the results.
To vote on the 3rd edition:
http://www.servlets.com/polls/question.tea?name=edition3
To view the .NET results:
http://www.servlets.com/polls/results.tea?name=dotnet
The May release of the com.oreilly.servlet MultipartRequest file upload tool added support for pluggable file renaming logic. Now there's an update that polishes that feature to provide the file name both before and after renaming. In this version getFilesystemName() returns the name as stored on the filesystem and there's a new method getOriginalName() to return the original file name. Enjoy! Get the software here.
Now available is a new com.oreilly.servlet release containing several substantial improvements in its file upload capabilities.
First, I've added support for internationalized filenames and parameter values. Previously all filenames were assumed Latin-1, but that only covers Western European languages. Now the library can support file names in any Java-supported encoding. This feature has been high on the wish list of Chinese, Japanese, and Korean programmers.
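To see why the Latin-1 assumption hurts, here's a small sketch that decodes the same raw bytes two ways. It uses only the standard Java charset classes; the class and method names are made up for the example and aren't part of the library.

```java
import java.nio.charset.Charset;

/** Sketch: the same filename bytes decoded under two different encodings. */
final class FilenameDecoding {

    /** Decodes raw header bytes using the named Java-supported encoding. */
    static String decode(byte[] raw, String encoding) {
        return new String(raw, Charset.forName(encoding));
    }
}
```

A three-character Japanese name sent as UTF-8 arrives as nine bytes. Decoded as UTF-8 you get the three characters back; decoded as ISO-8859-1 you get nine characters of junk.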
Second, I've added support for pluggable file renaming logic, so that if two files are uploaded concurrently to the same directory, the MultipartRequest class will notice and execute programmer-specified logic. One standard implementation is available in the package; it appends increasing integers to the filename.
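Here's a rough sketch of what an "append increasing integers" policy can look like. It is not the code shipped in the package: the RenamePolicy interface below is a stand-in I've defined so the example is self-contained, and the 9999 cap is an arbitrary choice.

```java
import java.io.File;
import java.io.IOException;

/** Stand-in interface for this sketch only. */
interface RenamePolicy {
    File rename(File f);
}

/** Sketch: report.txt, report1.txt, report2.txt, ... */
final class CountingRenamePolicy implements RenamePolicy {

    public File rename(File f) {
        if (claim(f)) {
            return f; // the name was free; keep it
        }
        String name = f.getName();
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        String ext = dot > 0 ? name.substring(dot) : "";
        for (int i = 1; i < 10000; i++) { // arbitrary cap for the sketch
            File candidate = new File(f.getParent(), base + i + ext);
            if (claim(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("No free name found for " + f);
    }

    /** Atomically creates the file; true means this caller now owns the name. */
    private static boolean claim(File f) {
        try {
            return f.createNewFile();
        } catch (IOException e) {
            return false;
        }
    }
}
```

Claiming the name with createNewFile() rather than testing exists() first means two uploads racing for the same name can't both win.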
Lastly, numerous bugs have been fixed. Most noticeable is an improvement in header handling to better support the Opera browser.
I hope you enjoy it. Please respect the license.
Servlets.com has two new mailing lists for those interested in receiving notification when new Servlets.com weblog entries are added. The first list provides instant notification, the second provides once-a-week notification of all entries added over that week. I've been asked for this feature by a few people, so I'm hoping people find it useful. You can subscribe here.
FYI, I'm using some home-grown software to manage the mail generation. It can pull data from any RSS or Meerkat feed and -- driven from an XML configuration file -- will send announcements at specified time intervals to any address as these feeds update. Comment on this blog entry if you want to take a look. It's darn useful.
Also, as of today all Servlets.com lists are hosted by Mailman. I've used Mailman on jdom.org with great success, and I think Mailman is easier to manage than ezmlm.
Hope you enjoy the lists!
To support the growing size of Servlets.com, I've added the ability to search for pages across the site. At the upper left of the standard nav bar you'll now see a simple but powerful Search box. Just enter your keyword(s), hit enter, and you're good! It's driven by Google's ultra-cool Site Search functionality.
You won't see it on blog pages like this one until I templatize the blogger. Go to the front page to try it out right now.
I just added a high-profile bug to the "Servlet Bugs You Need to Know About" list concerning URLConnection.getInputStream() in JDK 1.3. Seems the method does not return until the server completes the response and closes the socket connection, causing serious headaches for client-server applications. The bug list has the details and workarounds.
As the mail archive has grown (we're over 150,000 messages now) the front page archive listing has gotten slower, so I've changed the ViewLists servlet to cache the message counts for 24 hours. All hits after the first should be nearly instantaneous.
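The caching idea, roughly sketched: keep the last computed value along with the time it was computed, and recompute only once it's older than the time-to-live. This is not the ViewLists code; the class name and the injectable clock are assumptions I've made so the sketch is easy to test.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Sketch: cache one expensive value and reload it after a fixed time-to-live. */
final class TimedCache<T> {
    private final Supplier<T> loader;  // the slow computation, e.g. counting messages
    private final long ttlMillis;      // 24 hours would be 24L * 60 * 60 * 1000
    private final LongSupplier clock;  // normally System::currentTimeMillis
    private T value;
    private long loadedAt;
    private boolean loaded;

    TimedCache(Supplier<T> loader, long ttlMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the cached value, reloading it first if missing or expired. */
    synchronized T get() {
        long now = clock.getAsLong();
        if (!loaded || now - loadedAt >= ttlMillis) {
            value = loader.get();
            loadedAt = now;
            loaded = true;
        }
        return value;
    }
}
```

The first request after expiry still pays the full cost, which is the "all hits after the first" caveat.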
If you find other parts of the site slow, please comment below and I'll look at fixing them.
Added ReportMill to the Tools listing. It's a GUI page design tool with an ability for its pages to act as templates for servlets and be output as PDF or Flash. Slick. Thanks to Java Web Start I could run the demo by clicking on a link. Very slick.
Today I added a new feature to Servlets.com: Web logging ('blogging). I always wanted Servlets.com to have a "What's New" feature for the little things I add every day, but it's a serious pain in the wrist to change the front page and a secondary page for each entry, much less to manage archives. A weblog is just the ticket!
I'm also going to use this weblog to share some of the interesting things I hear and ponder about. Categories I've set up include Java, Open Source, XML, Web Services, and .NET.
Wait a second! Did he just say .NET?!
Yes, I'm taking a tour of the dark side. I have a new contract to help port a Java-based web application to .NET, and I'm looking forward to airlifting myself into an entirely foreign, possibly hostile land. I've maintained a work journal since I started writing "Java Servlet Programming" four years ago, and I've always found it tremendously useful as a knowledge repository. This time I'm going public with my journal. Starting from zero, you and I together will see what happens when a Java programmer tries to become a .NET programmer.