Why you should work at Burson-Marsteller on my team

The job app for an opening on my team went up a few days ago (March 2017), here: http://chj.tbe.taleo.net/chj05/ats/careers/requisition.jsp?org=BM&cws=1&rid=1694

It’s in downtown DC, close to a Metro stop. We plan to find a mid-career developer and pay mid-career developer rates.

What are we doing?
The job is mostly backend development, Ubuntu-Nginx-Postgres-Django. So UNPD or DUNP or UPDN or something like that. It’s a hybrid of data work, application development and data science, with a cloud component. We use AWS with a variety of server and service types. Developers are learning more about server administration in order to deal with AWS, and server administrators are learning more about application development, a phenomenon called DevOps*. (*Because I say so. Aficionados think DevOps is a cultural shift in IT departments, too.) We are down to one on-premises server and that’s the way we like it.

What’s the team look like?
We are doing what I call sincere scrum. You may be more familiar with the “yeah-right” software development process, where someone dictates SDLC and the team and management say “yeah right, we’ll do that”. Part of getting team members autonomy is having some clarity around roles. At the moment, we have a clear Product Owner (my supervisor), a Scrum Master (myself), a server administrator and a front end developer.

So what the heck is a scrum master? By the agile textbook, I’m in charge of facilitating the daily status reports, curating the ticket database, but I don’t primarily choose what features we implement. The product owner decides what features are going to make the team money.

Scrum (and especially ‘Extreme Programming’) have some opinions about practices, such as the need for unit testing, build servers, source control etc. I’m also the build master, which means, we have a build server & a build script that automatically compiles, lints, and runs tests. If you haven’t worked on an application with a suite of unit tests, it makes maintenance development far less stressful. New bugs and breaks are found fast and can be fixed fast. Build script driven development also means when it comes time to do testing with live humans or code review with live humans, time is spent on higher level concepts instead of nuisance bugs.

The team is distributed between DC and NYC. We don’t mind if team members work a day a week at home.

What are the cool technologies?
On top of a cake of data work and application development, for frosting we get to work with social media APIs, a huge multinode Redshift database, Redis, data science libraries like scikit-learn, pandas and so on. Many of our app’s components do huge data pulls that feed data science reports. The cost of data and the cost of statistical prediction have come way down in the last few years and we are exploiting that to do AI-like or magic-like predictions and analysis.

So throw your resume into the pile and our HR department will get to me:

NB: I don’t speak for Burson-Marsteller. I just happen to work there and we have a job opening on our team.

Adapting a template for a resume

For various reasons, I’ve become way too familiar with the technologies associated with creating resumes. For the record, the coolest are: CV Maker, LaTeX resume templates, JsonResume and just about anyone’s HTML5 resume template. The HTML5 templates are written by people who actually have artistic taste, so they look beautiful. No way could I do the same in a short time, so I bought a template. (Never mind that the template store’s UI let me buy the wrong template before I bought the right one, let’s focus on the happy parts of this experience.)

To use it for myself, I had to:

Assemble the raw material for a resume. StackOverflow Careers is my ground truth for resume data. From there I copy it to USAJobs and so on.

Load it up in IntelliJ. Visual Studio with ReSharper is not too bad, but if you just use IntelliJ, you get all the goodness that ReSharper was giving you and more.

Disable the PHP mailer. A contact form is just a spam channel. Don’t ask me why spammers think they can make money sending mail to web admins (unless maybe it’s spear phishing). I considered not showing my email address, but the spam harvesters already have my email address and Google already perfectly filters my spam.

Strip out the boilerplate. Every time you think you got it all, there are more references to John Doe.

Fix the load image. The load image was waiting for all assets to render before it would remove the annoying spinner and curtain. But the page did not have any elements that the user might interact with too early. The page didn’t have any flashes of unstyled content like you see with Angular. There weren’t any UI elements suddenly snapping into place on the document ready event like a certain app I’ve worked on before.

Deminimize the code. This should be easy, right? A template has to ship with the JavaScript source code. But the code was minimized. So I pointed the page to the non-minimized version. The whole page broke. Finally I noticed the minimized file actually contained version 1.2 and the non-minimized shipped was 1.0. So I deminimized the code and could begin removing the extraneous load image.

Upload to my hosted Virtual Machine. Filezilla all of a sudden decides it can connect, but can’t do anything. Some minutes later, I figure out that TunnelBear, my VPN utility, and Filezilla don’t play well together. So I added an exception for my wakayos.com domain.

Write blog post. I just wanted my resume to be a nice container. But as a developer, it sort of looks like maybe I wrote this from scratch. I certainly did not.

How to do postbuild tasks painlessly in Visual Studio

So, you’d think you’d need to write custom msbuild targets or even learn msbuild syntax. Wrong!

(Okay, some of this can be done via the project properties, the “Build Events” tab. The “Build Events” tab hides the fact that you are writing msbuild code, and now you have batch file code embedded in a visual property page instead of a file that can be checked into source control, diffed or run independently.)

You will have to edit your csproj file.

1. Create a postbuild.bat file in the root of your project. Right click for properties and set “Copy to Output Directory” to “Copy always”. A copy of this will now always be put in the bin\DEBUG or bin\RELEASE folder after each build.

2. Unload your csproj file (right click, “Unload project”), right click again to bring it up for edit.

3. At the very end, find this:
<Target Name="AfterBuild">
  <Exec Command="CALL postbuild.bat $(OutputPath)" />
</Target>

The code passes the \bin\DEBUG or \bin\RELEASE path to the batch file. You could pass more msbuild specific variables if you need to.

Strangely, the build output window will always report an error regarding the first line of the batch file. It will report that SET or ECHO or DIR or whatever isn’t a recognized command. But the 2nd and subsequent lines of the batch file run just fine.

From here you can now call out to powershell, bash, or do what batch files do.
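
As one hedged illustration of calling out from the batch file, suppose postbuild.bat contains a line such as `python postbuild.py %1`. The script name, that line, and the idea of writing a manifest are all assumptions made up for this sketch, not part of the setup above. The script simply records what landed in the output folder:

```python
import sys
from pathlib import Path


def write_manifest(output_dir):
    """Write manifest.txt in output_dir listing every other file and its size."""
    root = Path(output_dir)
    # Hypothetical post-build task: one "relative/path<TAB>size" line per file.
    entries = sorted(
        (p.relative_to(root).as_posix(), p.stat().st_size)
        for p in root.rglob("*")
        if p.is_file() and p.name != "manifest.txt"
    )
    manifest = root / "manifest.txt"
    manifest.write_text("".join(f"{name}\t{size}\n" for name, size in entries))
    return manifest


# $(OutputPath) is expected as the only argument, e.g. bin\Debug\
if __name__ == "__main__" and len(sys.argv) == 2:
    print(write_manifest(sys.argv[1]))
```

Because the script lives in a file of its own, it can be diffed and run by hand against any folder, which is the same argument made above for keeping the batch file out of the property page.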

What a Build Master should know or do as a Build Master.

An automated build is beautiful. It takes one click to run. The clicking on the run button is completely deskilled. It can be delegated to the cat.

Setting up and troubleshooting a build server on the other hand is unavoidably senior developer work, but I encourage everyone to start as soon as they can stomach the complexity.

Does it compile?

A good build pays close attention to the build options, as a production release will have different options from your workstation build. If it builds on your machine, you may still have accidentally sabotaged performance and security on the production server. Review all the compilation options.

In the case of C#, the rest of the build script is a csproj file, which is msbuild code, which is executable xml. You don’t need to know about how it works until stuff breaks and then you need to know enough msbuild to fix it. Also, because the build server sometimes doesn’t or can’t have the IDE on it, the msbuild script might be the only way to modify how the build is done.

The TFS build profile is written in XAML, which again is executable XML. Sometimes it has to be edited, for example if you want to get TFS to support anything but ms-test. Fear the day you need to.

Technologies to know: msbuild, IDEs (VS20XX), TFS GUI, maybe XAML, possibly JS and CSS compilers like TypeScript and SASS

Is it fetching source code correctly, can it compile immediately after check out to a clean folder?

When there are 50 manual steps to be done after checking code out before you can compile, the build master must fix all of these. Again, it builds on the workstation, but all that proves is that you have a possibly non-repeatable build.

Maybe 90% of the headaches have to do with libraries, or nowadays, repo managers, like nuget, bower, npm, etc. A sloppy project makes no effort to put dependencies into source control, and crappy tools mean the build server or build script is unaware of the library managers.

Technologies to know: tfs, nuget, bower, npm, your IDE

What is “good” as far as a build goes?

A good build server is opinionated and doesn’t ship whatever successfully writes to a hard drive. Depending on the technology, there isn’t even such a thing as compilation. Those technologies have to be validated using lint, unit tests and so on. These post-build steps can either be failing or non-failing post-build tasks. If they don’t fail the build, then often they are just ignored. Failing unit tests should fail a build. Other failing tasks probably should fail a build, even if they aren’t production artifacts. I usually wish I could fail a build on lint problems, but depending on the linter and the specific problems, sometimes there just isn’t enough time to address (literally) 1,000,000 lint warnings.

Technologies to know: mstest, nunit, xunit, and other unit test frameworks for each technology in your stack.
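
To make the failing versus non-failing distinction concrete, here is a minimal sketch of a step runner. It is not how TFS or any particular build server does it; `Step`, `command` and `run_build` are names invented for this example.

```python
import subprocess
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    run: Callable[[], int]    # returns an exit code; 0 means success
    fails_build: bool = True  # False: report the failure but keep building


def command(*args):
    """Wrap an external command line as a step action."""
    return lambda: subprocess.run(list(args)).returncode


def run_build(steps):
    """Run steps in order. Returns (succeeded, names_of_ignored_failures)."""
    ignored = []
    for step in steps:
        if step.run() == 0:
            continue
        if step.fails_build:
            print(f"BUILD FAILED at step: {step.name}")
            return False, ignored
        # A non-failing step: exactly the kind of result that tends to be ignored.
        print(f"warning: {step.name} failed, build continues")
        ignored.append(step.name)
    return True, ignored
```

A real script would pass something like `Step("compile", command("msbuild", "MyApp.sln"))`, where the solution name is a placeholder. Keeping the list of ignored failures around at least makes it visible how much is being waved through.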

Who fixes the failing tests? Who fixes the bugs?

The build master, depending on the organization and how dysfunctional it is, is either completely or partially responsible for fixing the build. There is no way to write a manual for how to do this. Essentially, as a build master, you have to dig into someone else’s code and demonstrate they broke the build and are obliged to fix it, or quietly fix it, or whatever the team culture allows you to do.

Technologies to know: nunit test, debugging, trace

We got a good build, now what? Process.

Not so fast! Depending on the larger organization’s policies with respect to command and control, you may need to get a long list of sign-offs from people before you can deploy to the next environment. Sometimes you can have the build server deploy directly to the relevant environment, sometimes it spits out a zipped package to be consumed by some sort of deployment script. Usually though, the build server can’t deploy directly to production due to air gaps or cultural barriers.

Technologies to know: Jira or whatever issue tracker is being used.
Non-technologies to know: your organization’s official and informal laws, rules and customs regarding deployment.

The Grand Council of Release Poobahs and your boss said okay, now what?
This step is often the most resistant to automation. It often has unknowable steps, like filling in the production password, production file paths and IP addresses.

MsBuild supports no less than two xml transformation syntaxes for changing xml config for each environment.

For environments you know about, it may be advisable to do environment discovery. It’s either wonderful or an easy way to shoot yourself in the foot. When you know the target server is a Windows 2008 Server and on such servers it must do X and on Win 7 workstations it must do Y, don’t forget to think about the Windows 10 machine that didn’t exist when you wrote your environment discovery code. Maybe it should blow up on an unknown machine, maybe it should

Technologies to know: batch, powershell, msdeploy, MS Word
Non-technologies to know: your organization’s official and informal laws, rules and customs regarding deployment.
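
A minimal sketch of the “blow up on an unknown machine” flavor of environment discovery mentioned above. The settings table, its keys and the `deploy_root` and `use_iis` values are all hypothetical; the only real calls are `platform.system()` and `platform.release()`, and the exact strings they report should be checked on the actual machines.

```python
import platform

# Hypothetical per-environment settings, keyed by (system, release).
KNOWN_ENVIRONMENTS = {
    ("Windows", "2008Server"): {"deploy_root": r"D:\sites", "use_iis": True},
    ("Windows", "7"): {"deploy_root": r"C:\dev\sites", "use_iis": False},
}


class UnknownEnvironmentError(RuntimeError):
    """Raised instead of guessing at settings for a machine we never planned for."""


def discover_settings(system=None, release=None):
    """Look up settings for this machine, or for an explicitly named one."""
    system = system or platform.system()
    release = release or platform.release()
    try:
        return KNOWN_ENVIRONMENTS[(system, release)]
    except KeyError:
        raise UnknownEnvironmentError(
            f"No deployment settings for {system} {release}; refusing to guess."
        ) from None
```

Failing loudly is only one of the options. The alternative is to fall back to some default, which is friendlier on the day the Windows 10 machine shows up and riskier on the day the default is wrong.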

Optional Stuff

Build servers like TFS also have built into them bug trackers, requirements databases, SSRS, SSAS (Analysis Services), and build farm things. They are all optional and each one is a huge skill. SSAS alone requires the implanting of a supplemental brain so you can read and write MDX queries.

Also, optional, is learning how other build servers work. No single build server has won all organizations, so you will eventually come across TeamCity, Lunt Build, etc.

Features worth searching for:
Log-by-level. E.g. info, warn, verbose, error.
Log-by-module/theme. E.g. MyClass, file1.js, Data, UI, Validation, etc. Sometimes called “groups” or other things.
Log-errors. Info, warn, verbose are all the same data type, but the error is a complex data type and the work flow differs dramatically from the others.
Log-to-different places, e.g. HTML, alert, console, Ajax/Server
Log-formatting. E.g. xml, json, CSV, etc.
Log-correlation. E.g. if you log to several places, say a client, server, web service and db, and a transaction passes through all four, can you correlate the log entries?
Log-analysis. E.g. if you generate *a lot* of log entries, something to search/summarize them would be nice.
Semantic-logging. E.g. logging (arbitrary) structured data as well as strings or a fixed set of fields.
Analytics. Page hit counters. (I didn’t search for these)
Feature Usage. Same, but for applications where feature != page
Console. Sometimes as a place to spew log entries, sometimes as a place for interactive execution of code.
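
To show what a few of these features look like in code (levels, named modules, structured entries and a correlation id), here is a small sketch. It uses Python’s standard `logging` module as a stand-in rather than any of the JavaScript libraries listed below, and the `correlation_id` and `data` field names are assumptions made for the example.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Log-formatting: emit each record as one JSON object per line."""

    def format(self, record):
        entry = {
            "level": record.levelname,   # log-by-level
            "logger": record.name,       # log-by-module/theme
            "message": record.getMessage(),
            # Hypothetical fields, supplied by the caller through `extra`.
            "correlation_id": getattr(record, "correlation_id", None),
            "data": getattr(record, "data", None),  # semantic logging
        }
        return json.dumps(entry)


def make_logger(name, stream, level=logging.INFO):
    """Build a named logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    return logger
```

A call such as `make_logger("app.validation", sys.stderr).warning("bad zip", extra={"correlation_id": "req-1", "data": {"field": "zip"}})` produces one JSON line. If the client, server and database all log the same correlation id, matching their entries up later is a plain search.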

Repository Queries:
http://bower.io/search/?q=logging 25 entries right now.

https://www.nuget.org/packages?q=javascript+logging Nuget JS logging libraries.

https://nodejsmodules.org/tags/logging – Nodejs Repository query

Various Uncategorized Browser-Centric Libraries
https://github.com/enoex/Bragi-Browser category logger (here they are called “groups”)
https://github.com/better-js-logging/console-logger category & level logger.
https://github.com/latentflip/bows colorful logging by category
https://oaxoa.github.io/Conzole/ Side console.
https://github.com/structured-log/structured-log – Serilog/Structured Log
http://log4javascript.org/ Log4JavaScript – for people who like the log4x API. As of 2015, appears dated & unmaintained.
http://www.softwarementors.com/log4js-ext/ More up-to-date log4x library. Fancy on-screen log.
https://github.com/stritti/log4js A level-logger. (Many features are IE only)
http://smalljs.org/logging/debug/ Console logger with module filters.
http://jsnlog.com/ Client side logger that sends events to popular server side logging libraries (more than just server side node)
http://www.songho.ca/misc/logger/logger.html on screen (HTML) logging overlay
https://github.com/jbail/lumberjack Monkeypatches the built-in console object.

Microlibraries for Browser
These might not be any smaller than other libraries.
https://github.com/bfattori/LogJS – supports local storage logging.
https://github.com/kapilkaisare/sleeper-agent 4 level logging with an on/off switch at runtime.
https://github.com/mattkanwisher/driftwood.js – 4 level logging with environment switches & ajax
https://github.com/pimterry/loglevel. 4 level logging Supports plugins.
https://github.com/cowboy/javascript-debug/tree/v0.4 console.log wrapper
http://js.jsnlog.com/ Same as JSN Log, but just the JS part, so it’s like a microlibrary.
https://github.com/nhnb/console-log – Polymer/web-component style console logging
https://github.com/icodeforlove/Console.js – Polyfill for pretty-console display?
https://github.com/Couto/groundskeeper – Build step to remove console.log entries before sending to production.

https://github.com/pockata/blackbird-js -Abandoned? Not sure what it does.
NitobiBug -Abandoned. If you look long enough you can find websites that serve up the file.

Browser plug ins
http://getfirebug.com/logging – Firefox centric.

Error Logging
Error logging is more than a print statement. Generally, at point of error you want to capture all the information that the runtime provides.
http://www.stacktracejs.com/ Stacktrace and more.
http://openmymind.net/2012/4/4/You-Really-Should-Log-Client-Side-Error/ Roll your own

Node Loggers (might work in Browser, not sure)
https://github.com/enoex/Bragi-Node (Node Centric)
https://www.npmjs.com/package/winston (Node Centric)
https://github.com/jstrace/jstrace (Node centric)
https://nodejs.org/en/blog/module/service-logging-in-json-with-bunyan/ Bunyan

Commercial Loggers
Often an open source client that talks to a commercial server. No idea if these can work w/o the server component.





Dumb Services – Accessing a website when there isn’t a web service API

The dumbest API is a C# WebClient that returns a string. This works on websites that haven’t exposed an asmx, svc or other “service” technology.

What are some speed bumps this presents to other developers, who might want to use your website as an API? The assumption here is that there is no coordination between

All websites are REST level X.
Just by the fact that the site works with web browsers and communicates over HTTP, at least some part of the HTTP protocol is being used. Usually only GET and POST, and the server returns a nearly opaque body. By that I mean, the mime type lets you know that it is HTML, but from the document alone, or even from crawling the whole website, you won’t necessarily programmatically discover what you can do with website. Furthermore, the HttpStatus codes are probably being abused or ignored, resources are inserted and deleted on POST, headers are probably ignored and so on.

Discovery of the API == Website Crawling.
If you could discover the API, then you could machine generate a strongly typed API, or at least at run time, provide meta data about the API. A regular HTML page will have links and forms. The links are a sort of API. You can crawl the website and find all the published URLs, and infer from their structure what the API might be. The URL might be a fancy “choppable” URL with parameters between /’s or it might be an old school QueryString with parameters as key value pairs after the ?.

You can similarly discover the forms by crawling the website. Forms at least will let you know all the expected parameters and a little bit about their data types and expected ranges.
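
A rough sketch of that discovery step for a single page, using Python’s built-in `html.parser`. A real crawler would also fetch each discovered link and repeat; `ApiSurfaceParser` and `discover` are names invented here, not part of any library.

```python
from html.parser import HTMLParser


class ApiSurfaceParser(HTMLParser):
    """Collect link targets and form descriptions from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.forms = []
        self._current_form = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "form":
            self._current_form = {
                "action": attrs.get("action") or "",
                "method": (attrs.get("method") or "get").lower(),
                "fields": [],
            }
            self.forms.append(self._current_form)
        elif tag in ("input", "select", "textarea") and self._current_form is not None:
            # The type attribute is the only hint about the expected data type.
            if attrs.get("name"):
                self._current_form["fields"].append(
                    (attrs["name"], attrs.get("type") or tag)
                )

    def handle_endtag(self, tag):
        if tag == "form":
            self._current_form = None


def discover(html):
    """Return (links, forms) found in one page of HTML."""
    parser = ApiSurfaceParser()
    parser.feed(html)
    return parser.links, parser.forms
```

The form description is about as much metadata as plain HTML gives away: an action URL, a method, and field names with a type hint. Ranges and required-ness would need the `min`, `max` and `required` attributes, which this sketch skips.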

If the website is JavaScript driven, all bets are off unless you can automate a headless browser. For a single page application (SPA), your GET returns a static template and a lot of JavaScript files. The static template doesn’t necessarily have the links or forms, or if it does, they are not necessarily filled in with anything yet. On the other hand, if a website is an SPA, it probably has a real web service API.

Remote Procedure Invocation
Each URL represents an end point. The trivial invocations are the GETs. Invocation is a matter of crafting a URL, sending the HTTP GET and deciding what to do with the response (see below.)

The action URLs of the forms are the other end points. The forms tell you more explicitly what the possible parameters and data types are.
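
Crafting the request is the mechanical part. A small sketch with Python’s `urllib.parse`, covering the “choppable” path style, the query string style and a form-encoded POST body; the function names and example URLs are made up for illustration.

```python
from urllib.parse import quote, urlencode


def build_get_url(base, path_parts=(), query=None):
    """Build a GET URL from path segments and/or query string parameters."""
    url = base.rstrip("/")
    for part in path_parts:
        # Escape each segment so a stray "/" or space cannot change the path.
        url += "/" + quote(str(part), safe="")
    if query:
        url += "?" + urlencode(query)
    return url


def build_post_body(fields):
    """Encode form fields the way a browser submits a plain HTML form."""
    return urlencode(fields).encode("ascii")
```

For example, `build_get_url("http://example.com", ["customers", 42])` gives `http://example.com/customers/42`, and `build_get_url("http://example.com", ["search"], {"q": "a b"})` gives `http://example.com/search?q=a+b`.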

Data Serialization.
The dumbest way to handle the response from a GET or POST is a string. It is entirely up to the client to figure out what to do with the string. The parsing strategy will depend on the particular scenario. Maybe you are looking for a particular substring. Maybe you are looking for all the numbers. Maybe you are looking for the 3rd date. In the worst case scenario, there is nothing a dumb service client writer can do to help.

The next dumb way to handle a dumb service response is to parse it as HTML or XML, for example with Html Agility Pack, a C# library that turns reasonable HTML into clean XML. This buys you less than you might imagine. If you have an XML document with say, Customer, Order, Order Line and Total elements, you could almost machine convert this document into an XSD and corresponding C# classes which can be consumed conveniently by the dumb service client. But in practice, you get an endless nest of Span, Div and layout Table elements. This might make string parsing look attractive in comparison. Machine XML to CS converters, like xsd.exe, have no idea what to do with an HTML document.

The next dumb way is to just extract the tables and forms. This would work if tables are being used as intended- a way to display data. The rows could then be treated as typed classes.
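
A sketch of the table approach, assuming the happy case where a table really is a data table and its first row holds the column headers. `TableParser` and `table_to_records` are names invented for this example, and every value comes back as a string.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the cell text of every table row on a page."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_to_records(html):
    """Treat the first row as headers and return the remaining rows as dicts."""
    parser = TableParser()
    parser.feed(html)
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]
```

Turning those dicts into typed classes still needs someone to say which column is a number or a date. Layout tables will happily produce records too; they just won’t mean anything.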

The next dumb way is to look for microformats. Microformats are annotated HTML snippets that have class attributes that semantically define HTML elements as being consumable data. It is a beautiful idea with very little adoption. The HTML designer works to make a website look good, not to make life easy for Dumb Services. If anyone cared about the user experience of a software developer using a site as a programmable API, they would have provided a proper REST API. It is also imaginable to attempt to detect accidental microformats, for example, if the page is a mess of spans with classes that happen to be semantic, such as “customer”, “phone”, “address”. Without knowing which elements are semantic, the resulting API would be polluted with spurious “green”, “sub-title” and other layout oriented tags.

The last dumb way I can think of is HTML 5 semantic tags. If the invocation returns documents, like letters and newspaper articles, then the elements header, footer, section, nav, or article could be used. The world of possible problem domains is huge, though. If you are at a CMS website and want to process documents, this would help. If you are at a travel website and want to see the latest Amtrak discounts, then this won’t help. I imagine 95% of possible use cases don’t include readable documents as an important entity. Another super narrow class of elements would be dd, dl, and dt, which are used for dictionary and glossary definitions.

Can there be a Dumb Services Client Generator?
By that, I mean, how much of the above work could be done by a library? This SO question suggests that up to now, most people are doing dumb services in an ad hoc fashion, except for the HTML parsing.

  • The web crawling part: entirely automatable. Discovering all the GETs, and Forms is easy.
  • The meta-data inference part: Inferring the templates for GET is hard, inferring the meta data for a form request is easy.
  • The Invocation part is easy.
  • The Deserialization part: Incredibly hard. Only a few scenarios are easy. At best, a library could give the developer a starting point.

What would a proxy client look like? The loosely typed one would, for example, return a list of URLs and strings, and execute requests, again returning some weakly typed data structure, such as string, Stream or XML, as if all signatures were:

string ExecuteGet(Uri url, string querystring)
Stream ExecuteGet(Uri url, string querystring)
XmlDocument ExecuteGet(Uri url, string querystring)

In practice we’d rather something like this:

Customer ExecuteGet(string url, int customerId)

At best, a library could provide a base class that would allow a developer to write a strongly typed client over the top of that.
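
A sketch of that layering, written in Python rather than C# and not taken from any existing library. `DumbServiceClient` is the reusable weakly typed base; `CustomerClient` is the part a developer would still have to hand write for one site. The site, the `customer` path and the `<span class="customer">` markup are all hypothetical.

```python
import re
from dataclasses import dataclass
from urllib.parse import urlencode
from urllib.request import urlopen


class DumbServiceClient:
    """Weakly typed layer: build a URL, fetch it, hand back a string."""

    def __init__(self, base_url, fetch=None):
        self.base_url = base_url.rstrip("/")
        # fetch is injectable so the parsing code can be exercised offline.
        self._fetch = fetch or self._fetch_over_http

    @staticmethod
    def _fetch_over_http(url):
        with urlopen(url) as response:
            return response.read().decode("utf-8", errors="replace")

    def execute_get(self, path, **query):
        url = f"{self.base_url}/{path.lstrip('/')}"
        if query:
            url += "?" + urlencode(query)
        return self._fetch(url)


@dataclass
class Customer:
    customer_id: int
    name: str


class CustomerClient(DumbServiceClient):
    """Strongly typed layer, written by hand for one hypothetical site."""

    def get_customer(self, customer_id):
        html = self.execute_get("customer", id=customer_id)
        # Assumes the page marks the name up as <span class="customer">...</span>.
        match = re.search(r'<span class="customer">([^<]*)</span>', html)
        if match is None:
            raise ValueError("customer name not found in response")
        return Customer(customer_id, match.group(1).strip())
```

The base class can be shared across sites. The regular expression cannot, which is the deserialization problem from the list above showing up in code.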

Using Twitter more effectively as a software developer

FYI: I’m not a technical recruiter. I’m just a software developer.

Have a clear goal. Is this to network with every last person in the world who knows about, say, Windows Identity Foundation? Or to make sure you have some professional contacts when your contract ends? Don’t follow people that can’t help you with that goal. If you have mixed goals, open a different account.

Important Career Moments Relevant to Twitter. Arriving in town, leaving town, changing jobs, conferences, starting a new company: if you have a curated Twitter list, it might help at those points in time, or it might not, who knows.

At the moment, there are so many jobs for developers and so few developers that the real issue is not finding a job, but finding a job that you like. Another issue is taking control of the job hunting process. The head hunters most eager to hire you have characteristics like making lots of calls per day and having a smooth hiring pipeline. But there is no particular correlation with what sort of project manager is at the other end of that pipeline.

Goals: Helping Good Jobs Find Developers. I’m talking about that day when your boss says, hey, do you know any software developers? And I say, no, I work in a cubicle where I talk to the same 3 people 20 minutes a week. So that was a big part of my goal for creating a Twitter following, so that in 3 years, bam, I can say, “Anyone want a job?” and it wouldn’t be just a message in a bottle dropped in the Atlantic. If you don’t care about the job, don’t post it. If a colleague desperately needs to fill a spot for the world’s worst place to work, don’t post it. You’re not a recruiter, you’ve got standards.

Twitter is a lousy place for identifying who is a developer and who is in a geographic region. After an exhaustive search, I found fewer than 2000 people in DC who do something related to software development, and of those, maybe 50% are active accounts. There must be more developers and related professionals than that in DC; I guess 10,000 or 20,000.

Making Content: Questions. It works for newbie questions. Anything that might require an answer in depth is better on StackOverflow. And StackOverflow doesn’t want your easy questions anyhow.

Making Content: Discussion. It works for mini-discussions, of maybe 3-4 exchanges, tops. Consider doing a thoughtful question a day. Hash tag it, but don’t pick stupid hash tags, or hash tag spam. #dctech is better than #guesswhat. Consider searching a hash tag before using it. Re-use good hash tags as much as possible to increase discussion around a hash tag.

Making Content: Jokes. It works really well for jokes. Now whether you actually engage in jokes is a personal decision. They are somewhat risky. On the other hand, if you never tell a joke, you’re a boring person who gets unfollowed and moved to a list.

Making Content: Calls to Action. I don’t practice this well myself because it’s hard to do in twitter. Most effective calls to action are some sort of “click this link”, hopefully because after I read the target page, I don’t just chuckle or say, “hmm”, but I do something different in the real world.

Making Content: Don’t do click bait. Not because it isn’t effective, it is effective in making people click. But everyone is doing it and it is junking up news feeds.

Building a Community: Who to Follow? Follow people you wish worked at your office. They may or may not post the content you like, but you can generally fix that by turning off retweets. If they still tweet primarily about stamp collecting, or tweet too much, put them on a list, especially if they don’t follow you back anyhow.

Building a Community: Finding people to Follow. Twitter’s own search works best: search for a keyword, limit to “people near me” and click “all” content.

Real people follow real accounts, usually. Real people are followed by 50/50 spambots and real people. Unfortunately, people follow stamp collecting and cat photo accounts, but are followed by friends, family and coworkers. If you are looking for industry networking opportunities, you care about the coworkers, not the stamp collecting and cat photo accounts.

Bios on Twitter suck. People fill them with poorly thought out junk. I don’t care who you speak for, I don’t care if your retweets are endorsements. Put the funny joke in an ephemeral tweet, not the bio; followers end up re-reading your bio over and over. Include where you live, your job title and keywords for the technologies you care about. Well, that’s what I wish people would do, but if you really want to put paranoid legal mumbo jumbo there, at least make sure that it aligns with your goals.

Building a Community: Getting Follow Backs. People follow back on initial follow, and sometimes on favorite and retweet.

Building a Community: Follow “dead” accounts anyhow. They might come back to life because you followed them. Who knows? It’s a numbers game.

Interaction: Retweet or Favorite? Favorite means “I hear you”, “I read that”, “I am paying attention to you”. Retweet means “I think every one of my followers really cares about this as much as they care about me.” People get this wrong so much I generally turn off retweets on every account I follow. I can still see those retweets should an account be on a list I curate.

Retweet what everyone can agree on, Favorite religion and politics. If someone says something you like, it’s a good time for engagement. But not if it means reminding everyone that follows you that after work hours, you are a Republican, Democrat or Libertarian. Favorites are comparatively discreet; the audience has to seek them out to find out what petition you favorited.

In practice, people Retweet when they should Favorite, junking up their followers’ news feeds with stamp collecting, radical politics, and personal conversations.

Interaction: Do start tweets targeted at one person with the @handle. It prevents that message from showing up in your followers’ feeds. Don’t automatically put the period in front; most people gauge wrongly when to thwart the built-in filter system.

Know Your Audience. I have two audiences: my intended audience of software developers in greater DC, and my unintended audience of people who follow me because they agree with my politics, or are interested in the same technologies as me. I have a clear goal, so I know that the audience I’m going to cater to is the one that aligns with my goals. I can’t please everyone and if I wanted to, I would open a 2nd account.

Lists: Lists are for you. Don’t curate a list with the assumption that anyone cares. They don’t. Consider making lists private if you don’t think the account cares if they’ve been put on a list.

Lists: Create an Audience List. The people I follow are great, but the people that follow me back are better. I put them on a private audience list because they don’t need a notification telling them that I’ve put them on an audience list.

People on my general list that don’t follow me back, I hope they will follow me back someday. The people on the audience list, I care about their retweets and tweets more because it’s just much more likely that I’ll get an interaction someday.

Lists: Create a High Volume Tweeter/“Celebrity” list. People who tweet nonstop junk up your feed; move them to a list unless they are following you back. “Celebrities” have 10,000s of followers but only a few people they follow. They probably won’t ever interact with you, but if they do, it will be via you mentioning them, not through a reciprocal follow relationship.

Things that went wrong trying to run pydoc

This is from the standpoint of a beginning python developer on windows. I assume you are like all other python developers and just stare at the machine to twiddle the electrons. These are my in progress notes on getting pydoc to run.

Install python. Install it over and over because it is just like Java, and .NET, dozens of side by side installs.
Install pip. It may or may not already be there.
Install pywin. I have no idea if it helps. My generators only generated for 3.3.
Install virtual environment so you don’t junk up the global interpreter with packages.

You will want to get c:\python33\;c:\python33\scripts in the path. Set the path the old fashioned way. If using powershell, reload the environment over and over. (pywin is supposed to help with this.)

$env:Path = [System.Environment]::GetEnvironmentVariable("Path","Machine")

ref: https://stackoverflow.com/questions/17794507/reload-the-path-in-powershell

Okay, check it like this

$env:Path.Split(";") | Where {$_.Contains("yth")}

The goal is to get all the sphinx cruft into a \doc\ folder. The whole mess will fight this every step of the way.

cd over to your source code folder (maybe) and run this. (I’m not sure about the best folder. Initially I tried running from the c:\python33\source folder, which junks up the Python directory and seems wrong.)

sphinx-quickstart


If it doesn’t run, your path is screwed up. This command creates a conf.py.
When answering its questions, accept the default most of the time, except for the API doc option (autodoc), which appears to be “n” by default. It appears that people use Sphinx not just for Python API documentation but for autobiographies, and it caters to that scenario first.

You’ll need to edit that conf.py

Install nano. You could instead open it with Word, or some other inappropriate application. At least nano doesn’t kick you out of the console and it isn’t vim.

If the conf file lives in the same directory as your source code, uncomment this:

sys.path.insert(0, os.path.abspath('.'))

Or if the conf.py file is in a sub directory of your app’s directory, use two dots

sys.path.insert(0, os.path.abspath('..'))
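
For orientation, here is a minimal sketch of the parts of conf.py that matter for API docs, assuming conf.py sits in a sub directory one level below the code. The project name is a placeholder I made up. Depending on the Sphinx version, the import lines may be commented out in the generated file as well, and the sys.path line fails without them.

```python
# conf.py (sketch): only the settings relevant to API documentation.
import os
import sys

# Make the code importable when conf.py lives one folder below it.
sys.path.insert(0, os.path.abspath(".."))

project = "MyProject"  # placeholder name, not from the original notes

# autodoc pulls docstrings out of the modules it imports.
extensions = ["sphinx.ext.autodoc"]
```

The extensions line is what answering “y” to the API doc question would have written for you.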

Pause. Go edit your Python files and add the guard below around anything that executes code (rather than just defining functions and classes) so you don’t get side effects when generating documentation. Also add docstrings of the form """blah""" to your functions. I think this is the expected syntax for the doc comments.

if __name__ == '__main__':
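
Put together, a module that is safe to document might look like the sketch below. The module and function names are invented for illustration. The point is that importing the file (which is what the doc generator does) defines things and reads docstrings but prints nothing.

```python
"""greeter: a tiny module used to try out the doc build."""


def greet(name):
    """Return a greeting for name."""
    return "Hello, " + name


def main():
    """Script entry point; only called when the file is run directly."""
    print(greet("world"))


# Importing this file does not reach main(), so there are no side effects.
if __name__ == "__main__":
    main()
```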

Create the rst files.

sphinx-apidoc -o api .

(Maybe -o . . would work better?)
I got modules.rst and foldername.rst. The modules.rst just has a reference to foldername.rst.

These .rst files are editable, “markdown”-like things (reStructuredText). They need further processing. Firstly, on my machine, sphinx-apidoc adds the folder name as the package, and then I get a “FolderName.filename package doesn’t exist” error for each file when running make. So edit the .rst file and fix the package names.

Copy those .rst files into the root of your source control or wherever conf.py is. I think. Put them in all the directories on the workstation if that doesn’t work. This is like working with Java: nothing knows where anything is.

Run these, although I have no idea what they do:

sphinx-build -b html . build

.\make html

And it spits out some html. Cd to the build/html folder and run this to launch chrome.

Start-Process “chrome.exe” “index.html”

I have no idea if this works; my source folder and Python folders are littered with crap created all over the place.
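
As a starting point for scripting the steps, here is a rough Python sketch that wraps the two command lines from these notes. It assumes sphinx-apidoc and sphinx-build are on the path and that it is run from the folder holding conf.py. The function names and the --run flag are my own inventions: without the flag it only prints what it would do.

```python
import subprocess
import sys


def build_commands(source_dir=".", api_dir="api", out_dir="build"):
    """Return the two command lines from the notes as argument lists."""
    return [
        ["sphinx-apidoc", "-o", api_dir, source_dir],
        ["sphinx-build", "-b", "html", source_dir, out_dir],
    ]


def run_all(commands):
    """Run each command in order; check=True stops at the first failure."""
    for command in commands:
        print("running:", " ".join(command))
        subprocess.run(command, check=True)


if __name__ == "__main__":
    commands = build_commands()
    if "--run" in sys.argv:
        run_all(commands)
    else:
        # Dry run: show the commands without touching the disk.
        for command in commands:
            print(" ".join(command))
```

Changing out_dir is the easy way to experiment with where the generated html lands.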

Official docs:
http://sphinx-doc.org/tutorial.html Quick start tutorial. If this works for you, then you are the sort of person that didn’t need docs in the first place. Sort of ironic.
https://codeandchaos.wordpress.com/2012/07/30/sphinx-autodoc-tutorial-for-dummies/ This is the more useful one, but not in itself enough to generate.
http://sphinx-doc.org/man/sphinx-apidoc.html You need this if you are using it for, I don’t know, python API documentation and not your autobiography.

TODO: Turn this into a .bat and .ps1 file.

Follow up Links
PyCharm has a built in understanding of Sphinx.
ref: Overview
ref: Python Integrated Tools Dialog.

REST Levels above 4

The Richardson Maturity Model says REST APIs can be ranked like so (Richardson numbers the levels 0 to 3; I’m counting from 1):

1- Plain old XML. You serve up data in a data exchange format at an HTTP endpoint. This ignores as much as possible about how HTTP was intended to work.
2- Resources have their own URL
3- Resources can be manipulated with GET, PUT, POST, DELETE
4- Resources return hypermedia, which contains links to other valid actions & acts as the state of the application.

I’ll add these:
5- Metadata. The API supports HEAD, OPTIONS, and some sort of metadata document like HAL.
6- Server Side Async – There is support for HTTP 202 & an endpoint for checking the status of queued requests. This is not to be confused with client side async. Server side async allows the server to close an HTTP connection and keep working on the request. Client side async has to do with not blocking the browser’s UI while waiting for a response from the server.
7- Streaming – There is support for the Range header for returning a resource in chunks of bytes. It is more like a resumable download, and not related to the chunk size when you write bytes to the Response. With ranges, the HTTP request comes to a complete end after each chunk is sent.

#5 is universally useful, but AFAIK there isn’t a real standard.
#6 & #7 are really only needed when a request is long running or the message size is so large it needs to support resuming.
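
To make #6 concrete, here is a small in-memory sketch of the 202-plus-status-endpoint pattern. It is not a real web server: the JobQueue class, the /jobs/<id> URL shape and the state names are all assumptions for illustration, and each method simply returns the status code and payload a handler would send.

```python
from collections import deque
from itertools import count


class JobQueue:
    """Stand-in for a server that answers 202 and finishes the work later."""

    def __init__(self):
        self._next_id = count(1)
        self._pending = deque()
        self._jobs = {}

    def submit(self, work):
        """Handle the POST: queue the work and say where to poll."""
        job_id = next(self._next_id)
        self._jobs[job_id] = {"state": "queued", "result": None}
        self._pending.append((job_id, work))
        return 202, {"Location": f"/jobs/{job_id}"}

    def run_next(self):
        """Worker step: happens after the original connection has closed."""
        job_id, work = self._pending.popleft()
        self._jobs[job_id] = {"state": "done", "result": work()}

    def status(self, job_id):
        """Handle GET /jobs/<id>: report progress, or 404 for unknown ids."""
        job = self._jobs.get(job_id)
        if job is None:
            return 404, {}
        return 200, dict(job)
```

The client gets its answer from submit immediately, then polls status until the state changes; nothing in the browser needs to be async for this to work.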

Clients should have a similar support Level system.
1 – Can use all verbs (Chrome can’t, not without JavaScript!)
2 – Client Side Caches
3 – Supports chunking & maybe automatically chunks after checking the HEAD
4 – Supports streaming with byte ranges
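
Here is a sketch of what byte ranges look like from both sides, again without a real network. serve_range and download are hypothetical names. The server function handles only the simple “bytes=start-end” form and returns what a handler would send; the client loop makes one complete request per chunk, which is the property that makes a download resumable.

```python
import re


def serve_range(body, range_header):
    """Answer one request for 'bytes=start-end' (inclusive) against body."""
    match = re.fullmatch(r"bytes=(\d+)-(\d+)", range_header)
    if match is None or int(match.group(1)) > int(match.group(2)):
        # Unsupported or malformed range: ignore it and send everything.
        return 200, {}, body
    start, end = int(match.group(1)), int(match.group(2))
    if start >= len(body):
        return 416, {"Content-Range": f"bytes */{len(body)}"}, b""
    end = min(end, len(body) - 1)
    headers = {"Content-Range": f"bytes {start}-{end}/{len(body)}"}
    return 206, headers, body[start:end + 1]


def download(body, chunk_size):
    """Client loop: keep asking for the next range until the total is reached."""
    received = b""
    total = None
    while total is None or len(received) < total:
        start = len(received)
        wanted = f"bytes={start}-{start + chunk_size - 1}"
        status, headers, chunk = serve_range(body, wanted)
        if status != 206:
            break  # nothing (more) to fetch, e.g. an empty resource
        total = int(headers["Content-Range"].split("/")[1])
        received += chunk
    return received
```

A client that dies halfway only has to remember how many bytes it already has.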

C# Basics– Classes and basic behavior

What upfront work is necessary to create fully defined classes? By fully defined, I mean there are enough features implemented that the class plays well with the Visual Studio debugger, WCF/WebAPI, JavaScript and XML, and so on.

Ideally, this boilerplate would always be available to consume on your custom classes. In practice, it is uncommon to see any of the following implemented. Why is that?

So let’s simplify reality and imagine that code is of only a few types:

(I’m suffixing all of these with -like to remind you that I’m talking about things that look like these, not necessarily the class or data structure with the same name in the .NET or C# spec or BCL)

  • Primitive-like. Single value, appear in many domains, often formatted differently in different countries. Sometimes simple, like Int32, sometimes crazy complicated like DateTime, sometimes missing, like “Money”.
  • Struct-like. Really small values, appear in some domains, like Latitude/Longitude pairs.
  • Datarow-like. Many properties, need to be persisted, probably stored in a relational or document database, often exchanged across machine, OS and organizational boundaries.
  • Service-like. These are classes that may or may not have state depending on the programming paradigm. They are classes with methods that do something, whereas all the above classes mainly just hold data and incidentally do something. It might be domain-anemic, like create, read, update and delete, or it might be domain-driven, like issue insurance policy, or cancel vacation.
  • Collection-like. These used to be implemented as custom types, but with Generics, there isn’t as much motivation to implement these on a *per type* basis.
  • Tree or Graph-like. These are reference values that contain complex values and collection-like values, and those in turn might also contain complex values and collections.

All classes may need the following features

  • Equality- By value, by reference and by domain specific. The out of the box behavior is usually good enough and for reference types shouldn’t be modified. Typically if you do need to modify equality, it is to get by-value or by-primary-key behavior, which is best done in a separate class.
  • Ranking- A type of sorting. This may not be as valuable as it seems now that LINQ exists and supports .OrderBy(x=>…)
  • String representation- A way to represent this for usually human consumption, with features overlapping Serialization
  • Serialization- A usually two way means of converting the class into string, JSON, XML for persistence or communicating off machine
  • Factories, Cloning and Conversion- This covers creation (often made moot by IOC containers, which sometimes have requirements about what a class looks like), cloning, which is a mapping problem (made moot by things like automapper), and finally conversion, which is mapping similar types, such as Int to Decimal, or more like “Legacy Customer” to “Customer”
  • Validation- Asking an object what is wrong, usually for human consumption
  • Persistence- A way to save an object to a datastore. At the moment, this is NHibernate, EF, and maybe others.
  • Metadata- For example, the .NET Type class, an XSD for the serialized format, and so on.
  • Versioning- Many of the above features are affected by version, such as serialization and type conversion, where one may want to convert between types that are the same but separated by time, where properties may have been added or removed. Round trip conversion without data loss is a type of versioning feature.

How implemented

  • Ad hoc. Just make stuff up. Software should be hard, unpredictable and unmanageable. The real problem is too many people don’t want to read the non-existent documentation of your one-off API.
  • Framework driven. Make best efforts to find existing patterns and copy them. This improves your ability to communicate how your API works to your future self and maybe to other developers.
  • Interface driven. A bit old fashioned, but rampant. For example these:
    //Forms of ToString(), may need additional for WebAPI
    IFormattable, IFormatProvider, ICustomFormatter,
    //Sort of an alternate constructor/factory pattern
    IDisposable, //End of life clean up
    IComparable, IComparable<T>, //Sorting
    //Competing ways to validate an object
    IValidatableObject, IDataErrorInfo,
    //Binary (de)serialization
    ISerializable, IObjectReference
  • Attribute driven. This is now popular for serialization APIs, e.g. DataContract/DataMember, and for certain validations.
  • Base Class- A universal class that all other classes derive from and implement some of the above concerns. In practice, this isn’t very practical, as most of these code snippets vary with the number of properties you have.
  • In-Class- For example, just implement IFormat* on your class. If you need to support 2 or more ways of implementing an interface, you might be better off implementing several classes that depend on the class you are creating features for.
  • Universal Utility Class- You can only pick one base class in C#. If you waste it on a utility class, you might preclude creating a more useful design hierarchy. A universal utility class has the same problem as a universal base class.
  • Code generation. Generate source code using reflection.
  • Reflection. Provide certain features by reflecting over the fields and properties.

All of these patterns entail gotchas. Someday when I’m smarter and have lots of free time, I’ll write about it.