Unixwiz.net - Steve Friedl's Weblog: August 2002 Archives

August 29, 2002

"Pornolize" for network testing - who knew?

Kasia recently mentioned the delightful www.pornolize.com service, which takes a URL as input, fetches the page, and produces an (ahem) slightly modified version. It's just hilarious to apply to your own page, and it's been a humor standby for some time.

But it's also a network testing tool: Wow!

Onsite at a customer installing a firewall, I had to test whether the internal web site was reachable from the outside through the firewall. My PalmPilot wasn't working, and nobody whose number I remembered was available. I didn't care to cool my heels, and then it hit me: pornolize!

Pornolize has to fetch the requested page just like somebody with a browser would, so if it's able to display an inappropriate-for-children version of the web page found on the internal server, then the firewall must have been set up correctly. Necessity is clearly the mother of invention.

Friends have pointed me to other services, such as the Altavista translator or an HTML validator, but these aren't my fun friends :-)

Who knew?

Posted by Steve at 08:53 AM

August 27, 2002

Unixware porting woes

If there were ever a great argument for Linux, it's that open source software "just works". This is not - and has never been - the case with SCO UnixWare, and I cannot believe how much time I and another consultant have burned trying to get XML and SOAP software working on it. Ugh.

Our goal was to get SOAP::Lite (a great perl package for talking SOAP) and Apache Xerces (for doing XML validations) working, and it's been one nightmare after another, much of which will ultimately prove to be non-billable time. "Non-billable time" are really sucky words in the consulting industry.

My consultant friend Kasia tried to get the Xerces Java version working on OpenUNIX 8 (which is really just Unixware 7.1.2), but the Java runtime (1.3.0) appeared to have a broken JVM. Since we ultimately have to move this to the production machine running an even older Unixware (7.1.1) with an even older JVM (1.2.2), we did not think we were going to get anywhere on this.

So I tore into the C++ version of Xerces, and after a day's work came to the conclusion that the SCO UDK compiler simply was not up to the task. Xerces pretended to have support for Unixware in the configuration setup, but it was absolutely incomplete (for our version, at least), and it took hours to figure out where to add the stuff that was required. Then, once the object files started compiling, the build blew up in the "prelink" process in a certain module. I believe this involved dealing with template instantiation, and of course the compiler docs don't talk about the failure condition.

Fine, so we can't use the UDK: I had the customer install gcc 2.95.2pl1. I made many of the same changes to the Xerces source as I did before, plus I even had to modify /usr/include/unistd.h for a compiler thing I couldn't work around. But then this blew up at link time. Great.

So a bit of rooting around the SCO web site shows a more recent gcc (2.95.3pl1), and installing this seems to do the trick. Now the library builds without error, and I'll wait until Kasia gets online to have her build the validator. We still have to modify the validator to have an XSD path, but that's not until we get the "base" part working. Then we see if the binary runs on the production Unixware 7 machine.

We had plenty of other issues along the same lines, but this gives a good flavor for how software is not portable unless it actually ports, and this has just been a lousy experience all around.

UnixWare is a tremendous commercial operating system: great commercial support, runs multiprocessors very well, and supports very large loads (my customer has 200 users).

We're still not finished with it yet, and I hope we get it working before it's time to retire.

Posted by Steve at 01:07 PM

August 25, 2002

Don't use /etc/profile !

People commonly put "important" environment variable settings in the /etc/profile file, but in lots of cases this is a mistake. But it's not usually obvious right away.

/etc/profile is processed at login time for all users, and of course anything set here is in effect for those login sessions. Things like the terminal type ($TERM) and the user command path ($PATH) are all appropriately set here.

But what about cron jobs? These do not get the benefit of /etc/profile, so any key environment variables (say, the java $CLASSPATH) have to be set more than once. This is a maintenance nightmare.

Better is to park the environment variable settings in a small file - say /etc/default/classpath - and source them where needed. From /etc/profile, from the cron jobs, anywhere. Then the file need be maintained just once, which gives future admins a fighting chance at getting it right.

In sh/ksh/bash, a file is sourced with the peculiar "dot" command:

. /etc/default/classpath

and it's processed as if it were included in the calling file.

This is a much more maintainable mechanism for dealing with systemwide variables.
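As a minimal sketch of the idea (not a drop-in configuration): the settings live in one small file, and anything that needs them sources it. A temporary directory stands in for /etc/default here so it can be tried without root, and the jar paths are made up for illustration.

```shell
#!/bin/sh
# Sketch only: a temp directory stands in for /etc/default, and the jar
# paths below are hypothetical.
settings_dir=$(mktemp -d)
settings_file="$settings_dir/classpath"

# The one file to maintain (/etc/default/classpath on a real system).
cat > "$settings_file" <<'EOF'
CLASSPATH=/usr/local/lib/xerces.jar:/usr/local/lib/app.jar
export CLASSPATH
EOF

# What /etc/profile, or the wrapper script behind a cron job, would do
# before running anything that depends on the variable.
. "$settings_file"

echo "CLASSPATH is $CLASSPATH"
```

Since cron hands its command lines to the shell, the same dot command can also go directly in a crontab entry ahead of the job, along the lines of `. /etc/default/classpath; /usr/local/bin/nightly-job` (the job name is hypothetical).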

Posted by Steve at 10:40 PM

August 22, 2002

"vim" does windows!

I've used vi for around 20 years, and I remember working on a platform that was too old for even vi (an Onyx Z8000 system running Version 7). Lots of people love to hate vi, but the "vi Improved" - "vim" - has all kinds of goodies that aren't always so obvious. I was chatting with a friend today and commented about the multi-window support that character-mode vim has.

"Really?"
"You bet."

When in vim, there are plenty of commands that work with split windows, but in practice you only need a couple to get real benefits.

control-W s - split current file into two windows

control-W n - create new (empty) window

control-W w - go to next window

Those are the big three. Once you're in a window, you can close it with the traditional :q just like closing any other kind of file, but this time it closes the window. Closing the last window exits vim entirely.

When you're in a window, you can :e filename to replace the current file with a new one. This works particularly well when you're in a blank window and want to edit an actual file rather than create a new one.

It seems that yank buffers are shared, so if you y a line in one window, you can go to another window (even a different file) and "paste" that line there with p. Numbered yank buffers are shared too.

It's worth taking a few minutes to get handy with this, because it really does become second nature after a short time, and it's so helpful that you'll wonder why you didn't learn this sooner.

Posted by Steve at 09:30 PM

August 21, 2002

How to run an unproductive web partnership

A customer of mine is setting up a partnership with a web services provider to exchange data over the internet using XML/SOAP, and I got roped in to do the communications part. I've never done this before, but I figured that with all the great toolkits out there I could wing it by plugging stuff together. I'm good enough with perl and problem solving that I figured I could do it.

I ultimately did get it working, but only after much more effort than it should have taken. These are some tips on how to make it hard for your partners if you're a WSP.

1) Use non-standard extensions

This WSP is using MIME attachments with SOAP messages, and I understand this to be not really common. None of the tools I've used know how to work with them, so it means rolling your own. This is a serious increase in complexity because it means you have to dig into the toolkits to really understand what's going on inside. In their case I'm not sure there was any way around using attachments, but this was a complicating factor.

In particular, XML Spy won't read the WSDL because of this, which completely undermined my attempts to use the SOAP debugger to learn this. So as fallback I had to do it the hard way.

2) Don't comment on known-to-work configurations

The WSP must be working with multiple partners, and it's hard to believe they don't get feedback on what works and what doesn't. I'd love to know "Avoid XXX" or "YYY works if you know what you're doing". Even anecdotal reports would be better than starting in the middle of an empty field. The only thing they have said is "We don't recommend any particular tools".

3) Provide broken sample code

The WSP did provide a "soap_wrapper" module in perl that was meant to sit on top of SOAP::Lite. But it was broken in a way that it never could have worked for this particular application (it was hardcoded to use HTTP and wouldn't use HTTPS correctly). There were other problems that caused us to burn a lot of time tracking them down.

In fairness, this code was much better than trying to start from scratch - I'd have been completely lost - but the changes required were really pretty straightforward to somebody who knew the package and would have saved us a ton of time. This suggests they're either not using soap_wrapper themselves or are not distributing updated versions.

4) Make documents hard to get

When we ask for stuff via email (say, XSDs), they usually arrive quickly, but it took more than a week to get the WSDL. It would make so much more sense to have a password-protected "developer partner" page where they could park stuff: documentation, XSDs, WSDL, etc. Then an email saying "Heads up- we changed something, go [HERE] to get them". I can't imagine a partner not wanting the WSDL right away.

5) Change things without telling your partners

After we finally got past the several soap_wrapper options, the "send up a document" API call was failing with an "Invalid API" fault code. This fault code wasn't documented anywhere, and I was sure that it was my own doing. Six hours later I finally sent off an email with full traces asking if they can see what's up, and they responded immediately with "we changed things - use this other API call instead".

Doh!

What's more annoying is that their "fault" message has room for a description string, and it would have been so much more helpful to return "Obsolete method - try YYYY instead" in addition to the "Invalid API" fault code. I find it hard to believe that I'm the only one who burned time on this, and I'm thankful I didn't start on this on Saturday morning.

To be fair, the WSP staff has been exceptionally friendly and eager to help, and I've really enjoyed working with them. In particular, the manager of the group has been great. My guess is that they're under deadlines like everybody else is, and since this is a new service introduction, it's likely that they're simply struggling to keep all the balls in the air.

It's also my guess that their other partners are using consultants with much more experience in XML/SOAP than I have (which wouldn't be that hard), so some of the issues that have stymied me would be mere speed bumps to them. It's not really fair to beat up a vendor for not catering to the least common denominator.

But dealing with your partners should mean overflowing them with information, even if those partners are all under nondisclosure. That each partner has to go through the same difficulties makes for a lot of reinvented wheels.

Thankfully, I believe the hard part is behind me (famous last words), and perhaps I'll get enough experience to do more consulting along these lines.

It's been quite an adventure.

Posted by Steve at 01:08 PM

August 19, 2002

"Omit dashes and spaces"

I just found yet another web site that has a place to enter a credit card number but limits the input to exactly the length of a card, and has that stupid "No dashes or spaces" limitation. Spacing in credit card numbers (and phone numbers) is there for a reason, to give people a way to process information in small chunks, and it's much easier to verify that my card number is right when I can type in the spaces and compare it back that way.

I understand that web software has to be careful about users entering bogus or dangerous data into a web form, but spaces and dashes? Is there any good reason to forbid this entry on a web form? I can't think of one.

What a lousy user interface.
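For what it's worth, accepting the friendlier input costs very little. A rough sketch in shell, where the function names are invented for illustration, the 13-to-19-digit range is an assumption, and the number shown is a commonly used test number rather than a real card:

```shell
#!/bin/sh
# Strip the spaces and dashes a person naturally types; everything else is
# left for the usual validation to judge.
normalize_card() {
    printf '%s' "$1" | tr -d ' -'
}

# Accept the cleaned value only if it is all digits and a plausible length
# (13 to 19 digits is an assumed range, not a rule taken from the post).
is_plausible_card() {
    cleaned=$(normalize_card "$1")
    case $cleaned in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "${#cleaned}" -ge 13 ] && [ "${#cleaned}" -le 19 ]
}

# Prints 4111111111111111
normalize_card '4111 1111-1111 1111'; echo
```

The point is only that the cleanup happens on the server side after the user has typed the number the way it appears on the card; anything other than digits, spaces, and dashes is still rejected.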

Posted by Steve at 06:39 PM

XML/SOAP is kicking my ass

I got thrown into a customer project involving e-commerce using XML and SOAP, and it's proven to be a real bear to work with.

XML has been described as "HTML on crack", and it allows the encoding of data in a much more self-descriptive format than regular HTML. Instead of just encoding the presentation of data, it encodes the meaning of the data:

<person> <firstname> Steve </firstname> <lastname> Friedl </lastname> <email> steve@nospam </email> </person>

XML has a more rigid syntax (no overlapping nesting, all tags must be closed, etc.), but it's tolerably readable by a person (though that's not really the main goal).

The tags actually required in an interchange can be defined by an XSD (a "schema"), and this says that a "person" has three elements: a "firstname", a "lastname", and an "email". It's straightforward enough to describe most hierarchical data structures. XSDs themselves are written in XML, and there are plenty of tools out there that will take an XML file and its associated XSD and validate the file against the schema.
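To make that concrete, here is a sketch of what a schema for the "person" example might look like, wrapped in a small shell script that writes both files and validates one against the other. It assumes the xmllint tool from libxml2 is installed (that is not the validator discussed in these posts), treats every element as a plain string, and skips the validation step if xmllint is missing.

```shell
#!/bin/sh
# Sketch: write the sample document and a schema for it, then validate.
workdir=$(mktemp -d)

cat > "$workdir/person.xml" <<'EOF'
<?xml version="1.0"?>
<person>
  <firstname>Steve</firstname>
  <lastname>Friedl</lastname>
  <email>steve@nospam</email>
</person>
EOF

# A "person" is a sequence of exactly three string elements, in this order.
cat > "$workdir/person.xsd" <<'EOF'
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="firstname" type="xs:string"/>
        <xs:element name="lastname" type="xs:string"/>
        <xs:element name="email" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
EOF

# Validate only if xmllint is available (an assumption about the machine).
validation=skipped
if command -v xmllint >/dev/null 2>&1; then
    if xmllint --noout --schema "$workdir/person.xsd" "$workdir/person.xml"; then
        validation=ok
    else
        validation=failed
    fi
fi
echo "validation: $validation"
```

Dropping the email element from person.xml, or reordering the elements, should make the validation step fail, which is the whole point of having the schema.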

Personally I use XML Spy Suite v4.4 , which is an excellent IDE for working with XML and related files. Their graphical schema design tools are really outstanding, as XSDs can be hard to "skim" to get the big-picture overview.

So far, so good: I've done this before, and generating XML from flat-file inputs in perl is really straightforward.

But then we step up to SOAP, which is the Simple Object Access Protocol, and it's essentially remote procedure calls over HTTP with XML used as the serializing mechanism instead of XDR. This is where it gets a lot trickier.

Let's say that I wish to offer a service that provides the current temperature in Tustin California via the web. Obviously a web page that updates now and then would work for a person sitting at a web browser, but to automate this is a bit more work. Clearly we could do "screen scraping" of the web page, but by using SOAP we can make the request look like a function call.

In practice, the remote client often sees it exactly this way (shown in pseudo-perl, drastically oversimplified):

my $service = new SOAPService("http://www.unixwiz.net/soap/");
my $temp = $service->getTemp("F");

Here, the SOAPService object knows how to package up the parameters ("F", for Fahrenheit) via XML, wrap everything in special SOAP wrappers and headers, make an HTTP connection to my web server and submit the request. My web server would decode the request and fetch the local temperature somehow. Then it serializes the answer and sends it back to the client where it's deserialized and presented back to the caller. A tremendous amount of work is done behind the scenes by that "getTemp" function call.
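For a feel of what goes over the wire, here is a rough sketch of a SOAP 1.1 request for that call, built by hand in shell. The namespace `urn:example:temperature`, the `scale` parameter name, and the SOAPAction value are all invented for illustration, and the endpoint is just the placeholder URL from the pseudo-code above, so the function that would send the request is defined but deliberately not called.

```shell
#!/bin/sh
# Sketch: the kind of XML a SOAP toolkit builds behind a getTemp("F") call.
workdir=$(mktemp -d)
endpoint="http://www.unixwiz.net/soap/"   # placeholder URL from the example

# SOAP 1.1 envelope; the getTemp namespace and <scale> name are hypothetical.
cat > "$workdir/request.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <getTemp xmlns="urn:example:temperature">
      <scale>F</scale>
    </getTemp>
  </soap:Body>
</soap:Envelope>
EOF

# Defined but not called: posting needs curl and a real service to talk to.
send_request() {
    curl -s \
        -H 'Content-Type: text/xml; charset=utf-8' \
        -H 'SOAPAction: "urn:example:temperature#getTemp"' \
        --data-binary "@$workdir/request.xml" \
        "$endpoint"
}

cat "$workdir/request.xml"
```

The response comes back as another envelope with the return value inside the body, which the toolkit unpacks before handing a plain value back to the caller.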

OK, so how do you describe the services being offered by a web service provider? One answer is "WSDL", Web Services Description Language. This is an XML file that describes each service being offered: the name, the parameter(s), the return value(s). This specification seems to be a little bit in flux, and unless one is really up to speed on all this technology, it is fairly dense reading.

I am using the SOAP::Lite toolkit in perl, which seems very full featured. The problem is that I keep getting "undefined" errors from deep within the SOAP::Lite package file itself, and at 5000 lines (I keep running into modules like this) it's just impenetrable. The code itself is only lightly documented internally.

I can create the XML files just fine, but talking to the web service provider has been no joy, so I figured I'll just try XML Spy's SOAP debugger. I load all the XSD files and create a SOAP session, and it barfs while parsing the vendor's WSDL file. There are bits and pieces that I'm able to hack my way through, but at some point the offending parts just look too important to rip out.

A seriously complicating factor is that the web services provider requires that larger XML documents be compressed and sent as attachments, and this isn't really provided for by SOAP::Lite due to the way MIME attachments are handled. I have a feeling that none of the other toolkit providers do either.

The vendor did provide a wrapper for SOAP::Lite that handles the attachments and the like, but since I can't even get it to work right in the non-attachment case, I'm pretty much stuck.

To put the cherry on top of my ice cream scoop of woe, I bought the O'Reilly book Programming Web Services with SOAP hoping it would help get me over my hump. Considering that they had specific coverage of Java and Perl, and that one of the authors is the fellow who wrote SOAP::Lite, I figured this would be a hit. Wrong.

The book gives a pretty decent high-level overview of the whole technology space, and the writing style is pretty good. But it's very thin on getting into the real nitty-gritty of SOAP::Lite. What a disappointment.

So I'm sitting here completely over my head, with deadlines fast approaching, other projects delayed, and I can't even think about running my clock for all this burned time.

Why did I take this project?

Posted by Steve at 01:29 PM

August 18, 2002

Aztec

My webmistress suggested that I read Aztec, which is a "historical novel", and to my mind this is just one step above Harlequin - not interested. I had visions of reading "Johnny Tremain" in the fifth grade, which just didn't do it for me. I've not read a historical novel in 30 years.

But she was persuasive, particularly in her characterization "depraved", so I gave it a try. I was captivated and at this point have read the two follow-ons. Just stunningly good writing, and it surprised the hell out of me.

It's the story of Mixtli, a very colorful Aztec man, as he recounts his life, and the incredible richness of Aztec culture and technology is much, much more interesting than it sounds in a cold review. There is war, conquest, sex (lots of it), very bizarre human sacrifice, and all the other things that make a book interesting. This was a can't-put-it-down book, and it does not take long to get into.

Everybody I know who's read this loves it. Not for the delicate, but everybody else ought to love this book.

Wow.

Posted by Steve at 09:11 PM

Patching WGET for FTP dotfile fetches

A customer needed to fully mirror a web site via the FTP interface, and we found that the otherwise wonderful "wget" tool didn't know about dotfiles. This meant that the three gigabytes of data sucked down omitted the ".htaccess" and related files - getting them by hand was looking to be a serious annoyance.

So I modified wget-1.8.2 to add the -. and --dotfiles command-line parameters, which enable dotfile fetches over FTP, and have written up a patch for it. I believe this to be a "full-service" patch in that it does the docs and stuff too.

Tech Tip: Patching wget to fetch dot files over FTP

I won't submit this to the wget maintainers until I get some mileage on this patch: those having experiences with it are encouraged to comment here.

Posted by Steve at 12:26 AM

August 15, 2002

What's a "Pluot"?

Today while in the produce section of the local grocery store, I saw a fruit called a "pluot". They look like apricot-colored plums, and I had to get a few. They are very tasty, and a quick google search shows that they are a complex cross of (imagine this) plums and apricots. There is the clear taste of both in the fruit, and it's well worth trying.

more info here

Posted by Steve at 10:24 AM

August 14, 2002

Ticketmaster sucks

Just bought a ticket to see The Young Dubliners, Seven Nations and Great Big Sea at the Anaheim (Disneyland) House of Blues for Friday night, and I'm amazed how Ticketmaster can stay in business with such crappy service. Their web server was constantly timing out, and even when it worked it was slow. It was exasperating. There was even a spelling error in the confirmation email - bozos. I wish they had some competition :-)

Oh, maybe I figured out how they stay in business: my single $20 ticket ended up costing $33.50 by the time the fees were added. Was surprised not to find a "fee-processing fee".

Thanks to Sharon for the great tip on these bands.

Posted by Steve at 03:58 PM

5000 Lines!

A customer was having troubles with some serial I/O under UNIX, and he asked me to look at it. Doing industrial-strength serial I/O in UNIX is somewhat of a black art, and my work with the VSI-FAX UNIX fax system has given me a bit more experience than most. There is no shame in not getting serial I/O right during your first at-bat, but this C++ module was 5000 lines long. In a single file. How on Earth do people think this way? Is there any good rationale for source files this large?

No, VSI/Esker is not the customer

Posted by Steve at 08:46 AM

August 13, 2002

Bizarre IM

I mostly hang out on Yahoo! IM, but occasionally jump onto AOL IM. The last time I did, I immediately got pinged by somebody I'd never heard of. It was such an odd conversation that I need to include it here:

A10warthog123: yo
A10warthog123: do u no how to hack?
StephenFriedl: hello
StephenFriedl: yes
A10warthog123: can u get aim passwords 4 me 4 nothing'
A10warthog123: ?
StephenFriedl: you've got to be kidding
StephenFriedl: sorry
A10warthog123: dammi5t
A10warthog123: dammit
StephenFriedl: do I know you?
A10warthog123: no
A10warthog123: u made a site
StephenFriedl: so I would hack for you because... ?
StephenFriedl: I made my own site.
A10warthog123: no shit
StephenFriedl: I'm pretty good at this
StephenFriedl: but I'm pretty sure I don't hack for strangers
A10warthog123: well ne way bye then
StephenFriedl: k

I've written some tools that may be popular with the script kiddies, but this seems pretty brazen.

Posted by Steve at 09:57 PM

Happy Birthday, Kasia

Well after Kasia set up my weblog and I did nothing with it for a while, I had to figure this out in time for her 29th birthday (Aug 14th). I'll see if I can't start adding stuff to this log now, but for now I'm happy to just bid my system admin a very happy birthday.

Posted by Steve at 09:49 PM