SimpleDB: non-printable characters

Posted by & filed under Work.

“this is my sweet string” == “dGhpcyBpcyBteSBzd2VldCBzdHJpbmcZ”?

Those two are actually equivalent, as proven by Amazon SimpleDB.  We started seeing these mysterious strings in our SimpleDB data, which is supposed to be a direct upload of SQL data for use in a UI.  I automatically assumed it had something to do with special characters and proper encoding, as we have seen in our processes before.  But this case was stranger: instead of just mangling the special character, it managed to blow out the entire string… WTF

The culprit was an “End of medium” ASCII control character (0x19).  ASCII control characters are all non-printable.  Once I had this figured out, some more googling led me to the reason the whole string was unrecognizable: base64 encoding.
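A quick way to see this for yourself, as a minimal Python sketch (Python is just my pick for illustration, it is not part of the original setup): decoding the mystery string gives back the readable text plus one trailing 0x19 byte.

```python
import base64

encoded = "dGhpcyBpcyBteSBzd2VldCBzdHJpbmcZ"
raw = base64.b64decode(encoded)

# Show the decoded bytes, control character included.
print(repr(raw))

# The final byte is the "End of medium" control character;
# everything before it is the text we expected to see.
trailing = raw[-1]
text = raw[:-1].decode("ascii")
print(hex(trailing), text)
```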

What’s happening/changed: http://www.dibonafide.com/?p=25

The official documentation: http://docs.amazonwebservices.com/AmazonSimpleDB/latest/DeveloperGuide/index.html?InvalidCharacters.html

In my case the characters were more of an anomaly and I just wanted to be rid of them.  I could take care of it either during the upload to SimpleDB or at the DB level.  The easier route was throwing this handy function on my DB and applying it in the view I use to pull the data for upload:

http://iso30-sql.blogspot.com/2010/10/remove-non-printable-unicode-characters.html

Per the official documentation above, the response object for the item will actually indicate what type of encoding it’s using.  A few sites have mentioned they just base64 encode on upload, and then decode when it’s being used for display.  I think with it specified in the item itself, you can probably just design your UI (or whatever is consuming the data) to check for base64 encoded strings, then decode, and remove any invalid characters there.
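Here is a rough sketch of that consumer-side idea in Python. It assumes you have already pulled the encoding marker out of the response yourself; `clean_value` and its `encoding` parameter are hypothetical names of mine, not part of any SimpleDB client library.

```python
import base64
import unicodedata

def clean_value(value, encoding=None):
    """Decode a value flagged as base64, then drop control characters."""
    if encoding == "base64":
        value = base64.b64decode(value).decode("utf-8", errors="replace")
    # Unicode category "Cc" covers the ASCII control characters.
    # Tab, newline and carriage return are kept on purpose.
    return "".join(
        ch for ch in value
        if ch in "\t\n\r" or unicodedata.category(ch) != "Cc"
    )

print(clean_value("dGhpcyBpcyBteSBzd2VldCBzdHJpbmcZ", encoding="base64"))
```

Values that come back without the marker pass through the same filter untouched, so the caller does not need two code paths.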

Now if I could just get WordPress to stop using invalid XML characters in its markup… that’s a topic for another day!

Usability: Degraded

Posted by & filed under Code, Rant.

Recently I’ve come across two cases where, in the interest of usability, we’ve gone too far and invaded what I thought used to be “boundaries” for UX and proper accessibility practice.

1. Google Hijacking My Down Arrow

Have you ever gone to a Google search results page and tried to use the down arrow on your keyboard to scroll down?  AND NOTHING HAPPENS?  They’ve hijacked your down arrow and given you what’s been affectionately named (by the internets) “little blue arrow”, which is a cursor for each search result.  Each press of your down arrow moves this cursor to the next result.  Before: one keypress of the down arrow would show roughly 10-20% more of the page “below the fold”. Now: I have to press my down arrow as many times as there are results showing “above the fold” to see more of the page…

I don’t know if this is some kind of conspiracy to increase views of top-ranking search results or just a bad attempt to “improve” usability.  I’d rather it be the former.

http://www.google.com/#sclient=psy&hl=en&source=hp&q=google+hijacking+down+arrow&pbx=1&oq=google+hijacking+down+arrow&aq=f&aqi=q-n1&aql=&gs_sm=e&gs_upl=2001l6244l0l6343l30l18l1l4l4l0l267l1988l4.5.4l13l0&bav=on.2,or.r_gc.r_pw.&fp=996a5a2e91f3c54d&biw=1086&bih=723

2. Sites are copy-jacking me

Yes, it appears the current definition of “copyjacking” has something to do with copyright stuff, but that’s stupid.  So I’m on a news article, and I copy a line of text:

The vote had no legal bearing on the state’s mega-project

I paste it in order to tweet something and this is what I get:

The vote had no legal bearing on the state’s mega-project
Read more: http://www.seattlepi.com/local/transportation/article/On-Ref-1-Seattle-says-build-the-tunnel-2076294.php#ixzz1VIzuHvAE

Not just text that I didn’t copy, but it’s got two whole line breaks, blowing up my little tweet window…  Even worse than Seattle City Council wasting our time on votes and referendums that have “no legal bearing”.

Based on my scientific Google research the practice seems fairly new, and the culprit on the site I was viewing was Tynt, as noted in the post below:

http://stackoverflow.com/questions/1203082/injecting-text-when-content-is-copied-from-web-page

I haven’t had time to crack open the js, but if they were also tracking what I copied it would be a great opportunity for a little #writealetter hacking to tell them what I think.  Maybe this atrocity will warrant general analytics-spamming, we’ll see who else is on my list.

Future of King County Metro – get free bus fare!

Posted by & filed under Rant.

The new $20 car licensing fee will allegedly stave off massive cuts in bus service.  The free-ride zone is being axed too, but a new perk popped up as I was reading this article: http://blog.seattlepi.com/transportation/2011/08/23/is-killing-the-ride-free-area-an-attack-on-the-poor/

Metro plans to hand out eight free bus vouchers when vehicles are licensed under a “Transit Incentive Program.”

Say what?  I assumed it was some green-inspired wage-against-car-esque initiative that I most certainly could take advantage of as a taxpayer.  Sure enough: http://metrofutureblog.wordpress.com/2011/08/12/congestion-reduction-charge-agreement-part-1-what-is-the-transit-incentive-program/

But of course, if you already ride the bus, or won’t take advantage, they hint that you should:

Alternatively, car owners will have the option to donate the value of the tickets to a program that supports low-income residents who depend on transit to access services in their communities.

No thanks, I’ll take my free $20 in bus fare.  Let’s consider it an offset for the famed “Discover Pass” and its extra fees for buying it from a vendor (http://blog.thenewstribune.com/politics/2011/08/18/rep-j-t-wilcox-slams-30-parks-fee/), as well as the $5 for state parks Washington tries to automatically throw in.

And on an editorial note about getting rid of the ride free area: this is a great thing for the bus tunnel.  Since light rail came through and now keeps it open late into the night, security is an obvious issue.  After visiting San Francisco, Washington DC and Philadelphia you realize you have to pay before you enter any transit “pavilion”.  I hope we take this step with the bus tunnel, put some pay boxes at street level, and throw some turnstiles in:

  • Instant security
  • Far less fare evasion
  • More efficiency in the tunnel, with fewer people who don’t know what they’re doing when they get on the bus

Human Cookie?: The Blockshopper Scam

Posted by & filed under Rant, Social Media, Write a Letter.

This is something that came up not too long after I may or may not have bought a house a while ago.  Thought it was worth posting as there are still a large number of frustrated people on the internet.  When googling myself I found an interesting “article” about a home purchase titled:

“Media technology specialist buys in Maple Leaf”

Now, I am in the “Internet” industry, so I can appreciate a good service, hack, etc…  This one is obviously interesting and clever: data mining from public records, and joining to the “public” internet.  This guy would probably have made a shit ton more money if he focused on something other than such a sensitive topic as real estate, home purchases, and public records.  Instead he spent his time building a public-records scraper that then joins to LinkedIn, and that’s it as far as I can tell.  Hopefully he doesn’t pay actual people to make the connections and write the stories… god forbid.

It was a pretty blatant connection as most of these “stories” have a direct link to the person’s LinkedIn profile.  It was very comprehensive as well, listing a current job, prior positions, previous employment and education information – everything that might have been on a LinkedIn profile.

The Catch

Long story short, I don’t like other people joining my public information together for others; that’s what I do to you when you piss me off – It’s an art.

After scouring through consumer complaints and stories of how to get it removed I found nothing concrete.  Then it dawned on me: They have NO right to assume that the same named person on LinkedIn was the same named person that bought the house! I work with data all day long, so I felt accomplished figuring this out: I call it an “erroneous join”.
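To make the “erroneous join” concrete, here is a tiny Python sketch with made-up data (the names and the second profile are purely illustrative, nothing here comes from the actual site): joining on a name alone happily matches one sale to more than one person.

```python
# Made-up sample data, for illustration only.
home_sales = [{"buyer": "Pat Smith", "neighborhood": "Maple Leaf"}]
profiles = [
    {"name": "Pat Smith", "title": "Media technology specialist"},
    {"name": "Pat Smith", "title": "Dental hygienist"},
]

# Join on name alone, the only key the two sources share.
matches = [
    (sale, profile)
    for sale in home_sales
    for profile in profiles
    if sale["buyer"] == profile["name"]
]

# One sale, two equally plausible profiles: the name does not identify the buyer.
for sale, profile in matches:
    print(sale["neighborhood"], "<-", profile["title"])
```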


ADD NEW COLUMN TO GOOGLE SPREADSHEET

Posted by & filed under Code, Java, Note To Self.

Yes, ALL CAPS, here is EXACTLY what I want to do:

Add new column to google spreadsheet using java api

Maybe now that term will be google-able? As with everything handy, it’s impossible to find what you need in the documentation, or on the broader internets. The fun part about Google Spreadsheets via the API is that it’s a lot like Amazon SimpleDB, since your “row” schema will be as large as your widest row. Because the only objects you have to work with are “List Feeds” (rows) and “Cell Feeds” (cells), there isn’t much of a concept of “Columns”, except in the ListFeed as the entry.getCustomElements().getTags() collection.

Bottom line, all I wanted was to see if a column existed or not, and if not, add it to the sheet. I haven’t seen any way to do it while in the ListFeed, so apparently it has to be done from the CellFeed. You’ll have to do some ListFeed-ing, then add it in the CellFeed, then you can go back to your ListFeed and hopefully access your new column.

This post has the solution:
http://stackoverflow.com/questions/4348610/how-do-i-create-the-first-line-in-a-new-google-spreadsheet-using-the-api

It was the only usable Google result for “Add a column”. It’s about adding columns to a blank sheet, which is also very usable and solved my problem. So, that should fix the Google searching for it. On to my problem:

I have:

  • 4 mandatory/default columns in my sheet to start
  • a configurable amount of additional “template” columns I want to populate.

I want:

  • To update data in my additional columns on a regular basis using the data in the 4 initial columns.
  • To take my configurable list of additional columns and compare it to the columns that currently exist:
    • Check each time for my additional columns when I start updating
    • Add the additional columns if they don’t exist
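The “compare and add” part of that wish list boils down to a set difference. Here is a minimal sketch of just that logic in Python rather than the Java API; it makes no calls to Google at all. `missing_columns`, the sample header names, and the case-insensitive comparison are all assumptions of mine. In the real thing, the existing headers would come from the ListFeed tags and the additions would be written through the CellFeed.

```python
def missing_columns(existing, wanted):
    """Return the wanted column names not yet in the sheet, in configured order."""
    present = {name.strip().lower() for name in existing}
    return [name for name in wanted if name.strip().lower() not in present]

# The 4 default columns (made-up names) and a configurable template list.
existing_headers = ["id", "name", "email", "updated"]
template_columns = ["Score", "Notes", "email"]

to_add = missing_columns(existing_headers, template_columns)
print(to_add)
```

Running this check at the start of every update keeps the sheet in step with the configuration without re-adding columns that are already there.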


OLAP: Make XML(A) work for you

Posted by & filed under Code, Note To Self, Rant.

The bread and butter at work is all the magic that happens in OLAP, Microsoft Analysis Services in particular. A lot of people assume OLAP is going the way of the buffalo; they probably also like buzzwords such as “nosql” and “bigdata”. I won’t argue that OLAP has a pretty high barrier to entry, as well as limited resources compared to a lot of other data technologies. And you probably shouldn’t paint yourself into a corner with it and/or put all your eggs in one basket. I say shouldn’t because it’s pretty hard to avoid in BI solutions, and most places probably end up like us – asking the question of “how do I turn this into a cube?”.

My main complaint is that people often treat OLAP, especially SSAS, as a giant “black box”. If it’s not in/done/accessible in the (Visual Studio) solution then it’s a dead end… It only takes a second to realize every solution file is XML, and that every item in a cube or OLAP db can be expressed in XML (from Management Studio). Turns out the XML for objects in the solution is about the same as what you see in the deployed cube/OLAP db. What does this mean? VS is just a giant XML editor… Not that we’re surprised.

I was forced to learn this by necessity, mostly from the need to integrate Microsoft Analysis Services into an open source ETL tool. First it was just figuring out simple processing and syncing, then on to dynamic cube modification. I had to rely on the ascmd.exe command line utility that was part of the Developer Toolkit as a huge crutch, but I didn’t have to use SSIS anymore. Maybe I’ll go into more detail some day, when someone else cares.
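To give a flavor of what “simple processing” looks like once you accept that it’s all XML, here is a hedged Python sketch that builds a bare-bones XMLA Process command for a database. `build_process_command` and the “SalesCube” ID are made up for illustration, and this is not the script I actually used. Handing the result to ascmd.exe or any other XMLA client is left out on purpose.

```python
import xml.etree.ElementTree as ET

# Namespace used by Analysis Services XMLA commands.
XMLA_NS = "http://schemas.microsoft.com/analysisservices/2003/engine"

def build_process_command(database_id, process_type="ProcessFull"):
    """Build a minimal XMLA <Process> command as a string."""
    # Emit the namespace as the default one instead of an ns0: prefix.
    ET.register_namespace("", XMLA_NS)
    root = ET.Element(f"{{{XMLA_NS}}}Process")
    obj = ET.SubElement(root, f"{{{XMLA_NS}}}Object")
    ET.SubElement(obj, f"{{{XMLA_NS}}}DatabaseID").text = database_id
    ET.SubElement(root, f"{{{XMLA_NS}}}Type").text = process_type
    return ET.tostring(root, encoding="unicode")

# "SalesCube" is a hypothetical database ID.
command = build_process_command("SalesCube")
print(command)
```

Because the command is just a string, an ETL tool only needs to template in the object IDs and processing type before sending it along.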

The holy grail for shops tied to SSAS is integrating or decoupling as much as possible from the MS stack, into open source technology. All while maintaining the sweet XMLA/MDX support for things like ADOMD etc…

In some recent play projects at work we’ve used SQLite as a DB, and I began searching for possible open source OLAP projects that would allow me to build a “cube” on it. I found Cubulus:
http://cubulus.sourceforge.net/
http://sourceforge.net/projects/cubulus/
It’s pretty cool, built in Python, but it’s built with a drill-down UI tied right on top of the MDX parser. Probably the best starting point, but there’s still a lot of fun left in trying to open it up to parse XMLA requests.

Obviously for enterprise-scale items there’s Mondrian, which everyone seems pretty shy of so far.
http://mondrian.pentaho.com/

However, CHECK THIS GNAR OUT (Xmla4js):
http://wiki.pentaho.com/display/COM/January+13,+2010+-+Roland+Bouman+-+OLAP+and+Analysis+for+web+applications+using+XMLA

It’s a JavaScript library for XMLA. Do I need to say more? I haven’t looked into it very far yet, but how could it NOT be badass?

Neighborhood Blogs: helicopter sightings

Posted by & filed under Blogs, Rant, Social Media, Twitter.

Have you found your local neighborhood blog yet?  They’re all the rage, informative, community-oriented, and often quite amusing.  One of the frequent “hot topics” is:

WTF Just flew over my house!?

Pretty much any low-flying object (including exotic birds) will warrant FULL coverage on your local blog.  I decided to scientifically collect some data on the keywords “loud” and “helicopter” specifically (using a Google site: search for each blog):

West Seattle Blog (412, originally 1,250) http://goo.gl/GqNPT

[edit]WS Blog’s Twitter thingy inflated results due to the mention in the Roosiehood tweet about the ‘copters… Advanced search to the rescue, still winners by a longshot[/edit]

My Ballard (67) http://goo.gl/XRx39

Capitol Hill (29) http://goo.gl/fM1Yy

Roosiehood (2): http://goo.gl/f7ZHI

No big surprise there, West Seattleites take the cake…  I wondered about this after a late-night ‘hood flyover of some military choppers. Thanks to the Twitter @RavennaBlog via Roosiehood, of course there was a morning follow-up…

http://www.roosiehood.com/2011/05/31/ask-roosiehood-what-were-those-loud-helicopters-over-roosevelt-tonight/

And I think the Ravenna Blog deserves a special shout out for what I would deem proper “Social Media Hierarchy Waterfall” implementation.  They know the difference between what deserves an entire blog post vs what “belongs” on Twitter…

Internet house in order… slowly

Posted by & filed under Uncategorized.

Public facing, possibly updated, site for me is back.
Got tired of:

  • Running my own development server
  • Janky WordPress installs on Windows Shared Hosting
  • The thought of having to create multiple WordPress installs for each of my possible internet properties…

So I:

Next:

  • Figure out broadcast/syndication across WP Multi-site (thought that’s what I was getting!) (plugins to the rescue)
  • Make more blogs
  • All kinds of Twitter integration
  • Get $$$