The Joyent years, 2018-2021

Today was my last day at Joyent.

It has been a weird 3 years and 3 months, but I enjoyed the experience, and learned a great deal along the way. As is always the case, I’m really going to miss the people the most; both those there now, as well as those I’ve worked with in the past. The company has used the term “Joyeurs” to describe its staff, and as much as I’ve never really liked the word, Joyent has managed to attract a certain kind of mind, and the people I’ve worked with during my time there have been lovely.

When I moved from Oracle, Solaris was in the twilight of its software lifecycle, with reduced staff and resourcing dedicated to the operating system, and I felt sad about that.

Earlier at Oracle, one of my assignments was the Solaris kernel tech-lead role, and having had a hand in a slightly stressful build-system rewrite before that, I was a little burned out. I ended my tenure there working on the Solaris compliance(8) subsystem, which I enjoyed, but increasingly, it felt like a change of scenery would be a good idea. Several of my former colleagues had moved to Joyent and it looked like Joyent was assembling quite the team. I wanted to help out.

Perhaps this would be my next “for-life” engineering job? Having spent 21 years at Sun/Oracle, I wasn’t looking for a short-term lease, though things don’t always work out the way we hope.

I started out with an eye on build systems, the work I was really looking forward to taking on. We filed RFD 145 as an effort to improve engineering productivity for Manta and Triton development. That effort took just over a year, but I think it was well worth it, making the build system a lot less error-prone, and easier for engineers to use. After that, I worked on various bits of internal tooling, and improvements to the way we were using Jenkins.

I also had a hand in some Manta and Triton fixes, a bit of work in the SmartOS build system, and had the opportunity to learn (and, to be honest, loathe) node.js along the way. Given Joyent’s open-source-led approach to software engineering, you can see almost everything I worked on during that time, from 2018 to 2019/2020 on my GitHub page.

Still, as was suggested to me recently, it started to feel like I might have joined about 5 years too late.

While we had Manta, with its ahead-of-its-time co-located compute capability on objects, as well as the later “v2” variant (which sacrificed that capability in favour of a more efficient metadata tier), neither was the approach our engineering organisation was eventually directed towards. A similar call not to continue with Joyent’s public cloud offerings started to suggest that Joyent maybe wasn’t quite the long-term prospect I’d hoped for.

Several of the old guard at Joyent left to form Oxide, and with notable staff departures to different companies, our work environment had changed. Nevertheless, a new org was formed and hiring ramped up to contribute to a different, mostly closed-source, in-house object-storage system. We were to build out that existing system, aiming eventually at the scale that Manta had previously delivered on. While that code was allegedly performing well on small configurations (running on very fast hardware that Manta never had access to, which always felt like a significant thumb on the scales to me), there was work needed to deploy it at a larger scale.

Curious to see what life after Manta would be like, I spent my last year at Joyent containerizing that system, writing new build infrastructure and new Ansible code to deploy it. I came up with some more efficient roles and playbooks, and now that we had the system composed of container images, we had the semblance of something that could work in production. This was all targeting CentOS on bare metal (for unclear reasons). The side effects of that work were that I got to learn Ansible from the ground up, and gained more Linux experience, for better and for worse, though I often missed the capabilities we had in SmartOS.

Over the past year, I’ve enjoyed drawing on the earlier Manta and Triton build work for this new solution. Growing a project that started as a bare-bones 2-week hackathon entry into something that we could legitimately consider for production was quite rewarding. Although there’s work still to be done, I hope the fundamentals are established. Getting the chance to write a simple build system from scratch, and to design it the way I’d talked about build systems since the beginning of my time at Joyent, was a pleasure. I hope that at some point it gets deployed to the scale we envisioned, and maybe it’ll eventually get released to the public.

With that work now in the past for me (no doubt my former colleagues will drive on with it), I’ll be taking July off to decompress, but I’m looking forward to my next job, and expect to be traipsing through an entirely different build infrastructure very soon.

Land Rover Bulkhead Replacement 2020/2021

As promised, here’s the blog post about my year-long, keep-timf-sane-during-the-pandemic Land Rover project. The work was carried out by David Faulker of DF Engineering at his workshop in Killinchy. I assisted by scouring eBay and the internet in general (and the Land Rover Series 2/2A parts manual in particular) for the parts we should be using, and by having many phone calls and occasional socially distanced meet-ups with David to gently nudge the project in the right direction.

This post was mostly written as a series of tweets, but after 39 of those, this very much became one of those “Just write a blog post, Tim” updates. As such, just imagine you’re reading this post as a series of tweets: not much editing has been done since I drafted them.

I dropped off the truck in February 2020, before the world changed (the 5-month delay in getting the new wiring loom, due to the manufacturer going into lockdown, was probably the longest of the delays). I picked the truck up at the end of April 2021, just a few days ago. What a weird year.

Where to start? Well, back in 2017, I bought this truck almost “by mistake.” When we moved here from Wellington NZ, I’d promised myself that since we were now able to afford an actual garage, it should really hold a classic car.

After some time spent looking at E30 and E34 BMWs, I heard that my wife’s cousin was selling this old Land Rover, and I thought “well, that seems more practical, and easier to maintain.” Well, we live and we learn, don’t we?

The truck was bought by the previous owner in 2008 as a “rebuilt 1961 88″ Series 2A on a new galvanized chassis.” Unfortunately, that statement hides several complications.

Here’s what it looked like in 2008:

Not looking much like a 2A

To the man on the street, it’s an old Land Rover, but you’ll notice the plastic grille, the flat door hinges, the lights-on-wings (despite those only moving there in 1969) and Defender-era rear side windows, all of which suggest that either a lot of work has been done on this Series 2A, or that there’s something else going on.

In fact, it looks very Series III (as did all of the interior, given that it had a Series III bulkhead) rather than the advertised Series 2A, but I only learned about those differences after buying it. And of course, since it had a 1990s 200tdi engine, “original” is never a word we’re going to use for this truck.

However, there are lots of positives: a 200tdi with an overdrive on a Series Land Rover is much more practical than a 2.25 diesel, a galvanised chassis won’t dissolve in the rain, and because it’s not a museum piece, I can enjoy it without being too precious about it.

But… here’s the problem, a few years later:

Feel the rot

With that rotten bulkhead, a simple bit of welding wasn’t going to cut it. The driver’s side floor panel was the worst, and there wasn’t enough good metal to weld a new panel to.

As we dug, the extent of the rot became clear. One wonders how it got so bad after only 10 years. Perhaps it was really an old Series III body placed on a salvaged Series 2A chassis, which was then replaced with the new galvanised one? This was a common (if slightly questionable) practice back in the day to drop vehicles into the “historic vehicle” tax bracket, but we’ll never know for sure. Either way, Series IIIs are now already in that bracket, so it seems a shame to have done that rather than restore the Series III vehicle to its original condition.

So what to do? My dilemma was whether to replace the rotten bulkhead with a Series III part and continue to live with the reg-plate/appearance mismatch, or to fit the correct-for-year bulkhead, along with an interior to match, despite knowing that it wasn’t really an original Series 2A.

Well, the cheapest and easiest option was to stick with the Series III bits and reuse what I had, but I felt this was the wrong direction.

The chassis number dates from 1961, so shouldn’t we have the rest of the car follow suit? And so, with that rather expensive decision made (deep breath, hide the bank statements), David set to work, gradually uncovering more rot as he went:

Nice bit of structural rust holding the bottom door pillar on.
A pretty rotten looking radiator support.

On the other hand, how much money do you throw at a restoration that’s never going to win prizes for originality?

I decided that what I liked about old Land Rovers was their spirit. A metal dashboard is fine. Simple manual controls are ok. If the reg-plate says “1961”, let’s fit tech for that era (with minor exceptions we’ll cover later). So, here we have more compromises. We now have a 1960s bulkhead (yeah, a suffix D, so not quite right for the year), and we’re keeping the old gauges – again, not early 2A gauges, but they were fitted to late 2As, so that’s fine – and all in all, I’m really happy with the result.

We have 5.5″ rims from a 109, but with modern radial 245x75R16 tyres that aren’t going to get me killed. (Extra fun trying to fit the bonnet-mounted spare, but less of a load on the rear door.)

And back from the paint shop! Land Rover described various hues as "Limestone" over the years, so we're going with a whiter shade this time. So long as we can colour match the roof when we eventually do the respray, that's fine by me.
Hefty tyre mounted on bonnet. I hope the frame can cope with it. Likewise, I look forward to weight-lifting sessions to get the bonnet open.

We have stainless bolts everywhere, with Allen key heads where that makes sense, rather than simple bolts or sherardized flat-head screws.

We have standard 12v car jacks rather than the old Land Rover inspection-lamp sockets. The red and black sockets on the instrument panel are cosmetic because:

a) nobody uses those sorts of connectors anymore

b) “yeah, let’s have a live electric terminal exposed on a metal dashboard, what can possibly go wrong?”

I'm not sure I really need two 12v sockets, but perhaps they'll come in handy. I removed the text from the lower one as an experiment, but suspect I'll remove it from the top one as well.

We have terrible but expensively refurbished individual Lucas FW2 wiper motors. No self-park, you get to turn them on individually, and a 90° sweep angle is for the brave (or foolish).

Refurbished Lucas FW2 wiper motors in really lovely condition. As was becoming a pattern, these were an expensive eBay purchase. I hope they work!

We have a new https://claytonclassics.co.uk heater (better than my old Series III heater ever was) acting as a drop-in replacement for the correct Smiths “ankle-burner.”

Deciding where to locate the heater was tricky – we had to avoid the 200tdi starter motor, and work out which portion of the bulkhead needed to be cut away to make it fit.

We didn’t stop with the cosmetics. There’s lots more, including a sand-blasted and repainted chassis, a new radiator (the old one was in a terrible state), a new radiator panel and refurbished flat valance, a new thermostat, a new brake servo and slave cylinder, a new wiring loom, new lights all round – the list goes on. As did the invoices.

A pretty rotten looking radiator.

Looking back at the project so far, I hope I’ve walked the line between properly representing the model era and giving a nod to modernity.

There is more work to do – I need to replace the rear door, which is starting to rot (and it was already in a bad state, cracking at the corners as a result of trying to carry a 205×80 tyre that was too heavy for it) and I’d really like to replace the rear side window panels with something more period-correct. After all that, it’ll need a respray so that we have period “Pastel Green” on all panels, but for the time being, I’m going to enjoy the truck over the summer, COVID-permitting.

Eventually, I also expect to replace the 200tdi diesel with an electric motor and batteries, but perhaps not for a few years.

So there we go – a very quick summary of a very slow Land Rover rebuild. There’s loads more photos at https://timfoster.smugmug.com/Land-Rover/Bulkhead-Replacement so please shout if you’ve got questions on any of this.

Temporary, for large values of “temporary”

Today is my last day at Oracle, after nearly 22 years working on and around Solaris.

I started at Sun Microsystems, Inc. in June 1996. Working there was literally a dream come true and I still count myself lucky to have had the opportunity to work on the best UNIX of its era.

I was an intern at Sun for a few months before accepting a permanent position in Nov ’96, with a fairly low sunid in the 23000s. Here’s what my badge looked like when I started:

Tim's Sun intern badge

My plan now is to take two weeks to unwind and then start a new job, which is going to feel pretty weird, but .. you know, good-weird :-)

Project Lullaby: build(1) log files

I’ve written before about Project Lullaby, the revamp we did on the Solaris ON build system for what became Solaris 11.4, and in that post, I mentioned in passing the new logging capabilities of the build, but didn’t go into much detail. This post is an attempt to address that.

I spent a long time getting build(1) to produce good logs – they’re the first place a developer will look for problems, other than the build notification email, which I’ll mention later. Clear build logs massively improve developer productivity and the converse is unfortunately also true, so it was worth taking time to get this right.

Separate log files

Because we broke the original monolithic build script into separate tasks, it was obvious that each build task should have its own log file. Before Lullaby came along, the build would produce a single giant “nightly.log” file, which was a little unwieldy when trying to find the errors from one particular part of the build.

Apart from just per-task logs, we went further and made sure that, where applicable, when a given task runs both debug and non-debug phases, we’d produce a separate log file for each phase.

Log format

A clearly readable log format was important – I wanted every line to be timestamped, and every line to include a log level that roughly indicated what had produced it: build(1) itself emitting some information, a task starting or completing, stdout from the task, and warnings or fatal errors caught by build(1).
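
To make that concrete, here’s a minimal sketch (not the actual build(1) code) of the kind of per-line format I mean, using Python’s logging module – the level names, messages and timestamp format here are illustrative only:

import logging
import sys

# Illustrative levels, roughly mirroring the categories described above:
# build(1) itself, task start/completion, stdout from the task, and
# warnings/fatal errors. The names are hypothetical.
STDOUT = 15
TASK = 25
logging.addLevelName(STDOUT, "STDOUT")
logging.addLevelName(TASK, "TASK")

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(logging.Formatter(
    fmt="%(asctime)s [%(levelname)-7s] %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S"))

log = logging.getLogger("build")
log.setLevel(logging.DEBUG)
log.addHandler(handler)

log.log(TASK, "starting task server-install")
log.log(STDOUT, "example stdout captured from the task")
log.warning("example warning caught by the build")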

For certain build tasks, we wrote custom log classifiers so that we could more accurately categorise the output those tasks produced.

As well as emitting the logs in plain text (grep is important!), we also produced html logs, with rudimentary syntax highlighting so that errors would jump out nicely. The syntax highlighting used the log level information I mentioned earlier, but also tried to detect:

  • a command being invoked by build(1)
  • stdout from dmake where we were executing a command
  • stdout that didn’t look like we were executing a command
  • the shadow compiler being invoked
  • errors detected in the output
  • entry into a new directory in the source tree

Each line of the html log includes a unique html anchor, allowing you to easily share links to specific lines with your long-suffering colleagues.
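
As a rough illustration of how little machinery that needs (this is a simplified sketch rather than the real implementation, and the classification rules below are invented stand-ins for the real classifiers):

import html
import re
import sys

# Invented stand-in rules; the real classifiers were per-task and far
# more thorough.
RULES = [
    (re.compile(r"\*\*\* Error code \d+"), "error"),
    (re.compile(r"^dmake( |:)"), "command"),
    (re.compile(r"shadow"), "shadow"),
    (re.compile(r"^==== Entering directory"), "directory"),
]

def classify(line):
    """Return a CSS class for a log line, falling back to plain stdout."""
    for pattern, css_class in RULES:
        if pattern.search(line):
            return css_class
    return "stdout"

def to_html(lines):
    """Emit one anchored, highlighted line per log line."""
    out = ["<pre class='buildlog'>"]
    for n, line in enumerate(lines, start=1):
        out.append("<span id='l{0}' class='{1}'>"
                   "<a href='#l{0}'>{0:6d}</a> {2}</span>".format(
                       n, classify(line), html.escape(line.rstrip())))
    out.append("</pre>")
    return "\n".join(out)

if __name__ == "__main__":
    with open(sys.argv[1]) as logfile:
        print(to_html(logfile.readlines()))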

Of course, when you’ve got an 80MB plain text file converted to html, that’s likely to bring most web browsers to their knees, which brings me to …

Log extracts

The above were both important in producing easily readable log files, but we wanted to go further still.

Since the first thing a developer does when encountering a build failure is to grep the logs for "*** Error code 1", we decided build(1) should be able to do that for you, and produce a log extract that doesn’t cause your text editor to eat all your RAM.

Typically, with nested invocations of dmake(1), the error that eventually causes the build to fail is quite a way above that final error status, so being able to emit those interim errors is important for fast problem detection. To find these errors, and elide all of the irrelevant content, we scanned through the failed log file, writing log lines to a ring buffer. When we hit an error, we’d seek ahead a few lines to add context, then write each chunk to the output file.
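
A minimal sketch of that extraction logic might look something like this – the real code knew about the log format above, multiple error patterns and html output, so treat the pattern and context sizes here as placeholders:

import collections
import re

# Placeholder error pattern; the real extractor recognised several.
ERROR_RE = re.compile(r"\*\*\* Error code \d+|: fatal error")

def extract_errors(log_path, out_path, before=20, after=5):
    """Write an extract containing each error line, plus `before` lines of
    leading context (from the ring buffer) and `after` lines of trailing
    context."""
    ring = collections.deque(maxlen=before)  # recent lines, oldest discarded
    trailing = 0                             # trailing-context lines still owed
    with open(log_path) as log, open(out_path, "w") as out:
        for line in log:
            if ERROR_RE.search(line):
                out.writelines(ring)         # flush the buffered context ...
                ring.clear()
                out.write(line)              # ... then the error line itself
                trailing = after
            elif trailing > 0:
                out.write(line)
                trailing -= 1
            else:
                ring.append(line)

# e.g. extract_errors("log/server-install.log", "log/server-install.extract")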

Of course, since the log file extraction code used the build(1) logging framework, we’d get html versions of these extracts too.

An example of one such log extract is here (the error in this case was a missing file commit, iirc).

mail_msg

Finally, we spent a while getting the build notification mail format right.

Now that the build was broken into specific phases, the mail_msg became a lot easier to structure.

Before Project Lullaby and build(1) came along, it was usually touch and go as to whether the build notification mail would give any useful indication of the actual error, forcing you to login to the build machine and start grepping through the build log.

With the log extracts we’d computed, there was a much bigger chance that the mail_msg could contain the actual reason the build failed. Of course, we also didn’t want to flood the poor gateling’s inbox, so we limited the size of each build task’s extract, and there’s probably more we could do to improve there.

Here’s the rough structure of the mail_msg we came up with:

  • Subject: overall status, $MACH, $VERSION, $PKGVERS and build machine
  • overall build status
  • start and completion timestamps
  • overall build time
  • build configuration file used
  • OS/Net $VERSION
  • $PKGVERS
  • mercurial changeset ids for each source repository and branch name
  • build machine configuration (solaris ‘entire’, ‘system/core-os’ and ‘developer/opensolaris/osnet’ package versions)
  • A list of tasks executed by the build, along with a ‘pass|fail|fail (ignored)’ indicator
  • Short log extracts for every failed task in the list

I included a sample mail_msg in my earlier post, but here’s another one for completeness:

Build summary for server i386 nightly build of /space/timf/jenkins/workspace/on-nightly

Build status    : pass (RTI issues)
Started         : 2018-02-22 20:47:31
Completed       : 2018-02-22 22:45:24
Total time      : 1:57:52
Build config    : /space/timf/jenkins/workspace/on-nightly/bldrc
Build version   : on-nightly-553:2018-02-22
Build pkgvers   : 11.4.0.1.0.16.40261

. changeset     : 634226c757d4 (server)
usr/closed cs   : ccb4732dee60 tip
usr/man cs      : 8961711c9b71 tip
usr/fish cs     : [repository not present]

Start timestamp : 1519361251.57

Build machine details:
Identity        : timf-build (i386)
WOS pkg         : pkg://solaris/entire@11.4,5.11-11.4.0.0.0.10.0:20171127T195212Z
core-os pkg     : pkg://solaris/system/core-os@11.4,5.11-11.4.0.0.0.11.1:20171211T162540Z
osnet pkg       : pkg://solaris/developer/opensolaris/osnet@11.4,5.11-11.4.0.0.0.11.1:20171211T162033Z


=== Task summary ===
server-tstamp         : pass
server-check.rti_ready: fail (non-fatal)
server-save_packages  : pass
server-clobber        : pass
server-tools          : pass
server-setup          : pass
server-man            : pass
server-install        : pass
server-packages       : pass
server-pkgsurf        : pass
server-pkgdiff        : pass
server-setup-nd       : pass
server-install-nd     : pass
server-packages-nd    : pass
server-pkgsurf-nd     : pass
server-pkgdiff-nd     : pass
server-check          : pass
server-check.protocmp : pass
server-check.elf      : pass
server-check.ctf      : pass
server-check.lint     : pass
server-check.cstyle   : pass
server-check.cores    : pass
server-check.findunref: pass
server-check.paths    : pass
server-check.pmodes   : pass
server-check.uncommitted: fail (non-fatal)
server-check.install-noise: fail (non-fatal)
server-check.lint-noise: pass

check.parfait         : disabled


=== Task details ===
.
. (I'm omitting these for brevity)
.

All done

When the build completed or was interrupted, we’d move all the files for that build into a separate log.[timestamp] directory, and update a “last” symlink to point to that new directory before sending the final mail notification.

Finally, if we’re running individual build tasks rather than the full build, we’d rename any existing build logs to take a timestamp suffix to avoid destroying any evidence.
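
A sketch of that end-of-build shuffle, with illustrative paths and names (the real build(1) did rather more book-keeping than this):

import glob
import os
import shutil
import time

def archive_build_logs(logdir="log"):
    """Move a completed build's logs aside and repoint the 'last' symlink."""
    dest = "{0}.{1}".format(logdir, time.strftime("%Y-%m-%dT%H:%M:%S"))
    shutil.move(logdir, dest)
    os.symlink(dest, "last.new")    # create the new link alongside ...
    os.rename("last.new", "last")   # ... then atomically replace 'last'
    return dest

def preserve_task_logs(logdir, task):
    """For single-task runs, rename any existing logs out of the way."""
    stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
    for path in glob.glob(os.path.join(logdir, task + "*.log")):
        os.rename(path, "{0}.{1}".format(path, stamp))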

Implementing the logging part of build(1) was one of the most satisfying parts of the Lullaby project for me – making lots of developers’ lives easier is a good thing, and I believe Solaris development really benefited from that work.

Blog like a tech lead: 3.25 years in the Solaris OS/Net core team

Ok, I told a white lie in my previous entry – here’s one last post to end this series. To recap, here’s what I’ve written so far about my time as OS/Net tech lead for Solaris:

  1. Party Like a Tech Lead
  2. Project Lullaby
  3. Ensuring the quality of ON deliverables
  4. Maintaining and improving developer productivity: heads-up and flag days
  5. Being a tech lead and dealing with people
  6. Drinking from the information firehose: a tech lead’s job

From all of the above, I suspect you’ll have realised that contributing to Solaris is something I really care about. I’ve been working on this OS in various forms since 1996, and have seen it go through ups and downs throughout that time.

The things I take away from my time on the core team are not just the joy of helping to put together the bits and bytes of the operating system itself, nor just the sense of satisfaction I got from working on something that’s important to our users – the best part of the job was the people I got to interact with on a daily basis.

Many of these developers, managers, testers and technical writers have been working on Solaris (like myself) for a long time. Many others are recent additions to our engineering staff, but there’s a common trait that unites us: a desire to develop a good operating system – one that doesn’t cut corners, one that fails gracefully under load, one that performs and scales well on a variety of platforms, is easy to install, secure, administer, upgrade and debug, and ultimately, shields its users from having to worry about anything other than running their applications.

Every platform should be like this, whether that’s a traditional OS, or a “blah-as-a-service” entity running in somebody else’s datacenter.

Whatever I end up working on next, I’ll be privileged if I get to do it alongside those same engineers, and if not, I’ll try to bring a few Solaris-worthy values along with me – they’re worth propagating.

Drinking from the information firehose: a tech lead’s job

This is the last of my posts documenting some of the tips and tricks I found useful during my time as Solaris ON technical lead, which hopefully might help others thinking of taking up similar positions.

I’d originally planned to talk about project reviews in this post, but since I ended up covering that last time, I’m instead going to talk about a different challenge that you’ll face when doing this sort of job.

My transition out of this role took a little longer than expected – the first engineer we’d intended as my replacement got lured away to a different part of the company. However, one of the questions this engineer asked when I was explaining the sources of information he needed to keep track of was:

“How on earth do you keep up with all of this stuff?”

Here are some of the sources I was talking about:

  • every commit to the source tree, and a rough idea of its content
  • the above, except also for the Userland and IPS consolidations
  • every bug filed
  • every bug that lands on the Release Criteria list
  • every bug that might be considered a Release Stopper bug
  • every new ARC (Architecture Review Committee) case filed, and how the discussion is going
  • every IRC conversation about the gate and problems we discover in Solaris
  • every test report
  • every mailing list post (ON, Userland and IPS-related, along with a host of other Solaris-related lists)
  • every mail addressed specifically to the gate staff
  • every project being submitted for review
  • every build failure

This amount of information is clearly a recipe for insanity, and perhaps one of the reasons why we try to rotate the tech lead position every 18 months or so, but there has to be a way of dealing with it all while keeping your head.

I’ll talk about some of the techniques I used a little later, but first, I want to explain why all of that information is useful.

These sources helped me maintain a general sense of project awareness, so that if someone was talking about a bug they’d just encountered, I’d be able to recall whether that subsystem had changed recently, or whether that problem sounded similar to a bug that had recently been filed. If people were seeing build failures, I’d hopefully be able to recognise them from the ones I’d seen before.

Gathering and keeping track of all that information often helped answer questions that triggered my “that’s funny” response – a “spidey sense” for the gate, if you like. Even if the problem being described didn’t exactly match one I’d seen before, just getting an inkling that a problem sounded similar to another was often a thread we could pull to unravel the cause of the mystery!

I felt that one of the roles of a tech lead is to act as a sort of human information clearing house. I wasn’t expected to know the nitty-gritty details of every subsystem in Solaris, but I was expected to know enough about what each did, the impact of it breaking, and who to call if that happened.

[ Aside: I am not claiming to be a “Full Stack Engineer” – honestly, the concept seems ludicrous. Yes, I’m sure I could write a device driver given time to learn. I think I could write a half-decent web-ui if needed, and sure, I could maybe debug kernel problems or resolve weird Django problems, but I wouldn’t be the best person to do any of those things: humans specialise for a reason. ]

So, with so many sources of information, how do you keep track of them all?

Read

There’s no escaping it – you need to spend a lot of time reading email. As much as I agree with my colleague Bart Smaalders, who maintains that “You will contribute more with mercurial than with thunderbird” – that spending your time arguing on mailing lists eventually becomes counter-productive without code to back it up – reading email is still a necessary evil.

Everyone has their own way of filtering mailing lists – mine was to sort roughly by category (not mailing list), using Thunderbird tags to assign a colour to each category, and to move mails into subfolders for searching later. Deleted mails would be shown with strikethrough in my inbox, rather than simply hidden until the next time I compacted my mail, but each message’s Subject: line would retain the colour I’d assigned to it.

This meant that on a typical morning, at a glance I could quickly tell which categories were active, based on the different rainbow-hue of my inbox. A mostly blue day meant a lot of bug updates. Red days were security-related issues, orange meant lots of gatekeeper@ activity, etc. Most days, if you mixed the colours together, would have been a muddy brown colour, no doubt!

Next, our putback mails are normally just the standard ‘hg notify‘ messages; however, we have a separate mailing list that also includes full diffs of each commit – I subscribed to those, and used to scan each commit as it arrived.

I’ll talk a little more about a tool I wrote to keep a sense of gate awareness later, but despite that, having mail notifications for each putback and convenient access to its content was still incredibly useful.

Bugs

My predecessor, Artem Kachitchkine, wrote a wonderful daily-bugs web page, which I’d check each morning to see what issues were being filed, concentrating on higher priority bugs, or those with interesting synopses. I’d keep in mind what recent changes had integrated to the source tree and, for severe problems that were likely to impede testing, I’d often get in touch with the submitter or responsible engineer straight away.

This web UI allowed you to quickly page back to previous days, and this chronological view of bugs was really useful. (e.g. “Oh, that bug was filed before this particular change integrated”, without having to formulate bug database queries)

Several years ago, Solaris transitioned from the old Sun “bugster” database to Oracle’s internal bug database. This made us happy and sad at the same time. The downside was that the web UI was (and still is) utterly horrendous; however, the really big upside was that everyone finally had raw SQL access to the database, something that was only available to a select few at Sun.

So, wanting to avoid that terrible UI, I wrote a simple mod_wsgi-based REST-like bug browser, allowing me to form simple queries of the form:

http://<server>/report/<category>/<subcategory>

to browse bug categories or view open bugs against a specific subcategory. I added a count=n parameter to limit the number of results returned, an rfes=true|false parameter to optionally hide Enhancement requests (a separate class of bug report), and a special “/all” URI component to show all bugs in that category.

I added a “/tags” URI component that would ‘AND’ subsequent URI components to show bugs with specific tags, e.g.

http://<server>/report/utility/filesystem/tags/slp64

would show bugs against filesystem-related utilities that were marked as being interesting for 64-bit conversion.

Very often, simply being able to quickly find the most recent bugs filed against a specific subsystem was terribly useful, and in cases where a bug could feasibly be filed against a number of areas, having fast access to those was great.
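
To give a flavour of how simple the whole thing was, here’s a cut-down WSGI sketch of the URI handling described above – it’s a reconstruction rather than the original code, and run_report() stands in for the part that actually queried the bug database:

from urllib.parse import parse_qs

def run_report(category, subcategory, tags, show_all, count, rfes):
    # Placeholder: the real version ran a SQL query here and rendered
    # the results as an html table.
    return "cat=%s subcat=%s tags=%s all=%s count=%d rfes=%s" % (
        category, subcategory, tags, show_all, count, rfes)

def application(environ, start_response):
    """Handle /report/<category>[/<subcategory>][/all][/tags/<tag>...] URIs."""
    parts = [p for p in environ["PATH_INFO"].split("/") if p]
    qs = parse_qs(environ.get("QUERY_STRING", ""))

    if parts and parts[0] == "report":
        parts = parts[1:]

    tags = []
    if "tags" in parts:                      # everything after /tags is ANDed
        idx = parts.index("tags")
        parts, tags = parts[:idx], parts[idx + 1:]

    show_all = bool(parts) and parts[-1] == "all"
    if show_all:
        parts = parts[:-1]
    category = parts[0] if parts else None
    subcategory = parts[1] if len(parts) > 1 else None

    count = int(qs.get("count", ["50"])[0])          # limit the results
    rfes = qs.get("rfes", ["true"])[0] == "true"     # show enhancement requests?

    body = run_report(category, subcategory, tags, show_all, count, rfes)
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body.encode("utf-8")]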

Chat logs

Keeping local logs of the main ON chatroom was especially useful to me as a non-US developer. Each morning, I’d catch up on the previous night’s conversation to see what problems people were reporting, and would often start investigating or sending mails to see if we could make timezones work to our advantage and have the problem solved before the US folks woke up.

Having those in plaintext meant that I could grep for cases where I thought we’d discussed similar problems, possibly years ago, and quickly find answers.

Spelunk

To help out with my gate awareness, I wrote a tool that would allow me to quickly cross-reference source tree commits with the bug database, and to track each commit’s impact on the “proto area” (the set of files and binaries from a build that made up the packages Solaris was constructed from).

In a not-quite text-book example of “Naming things is hard”, I unfortunately called this utility “spelunk”, trying to evoke the sentiment of source code archeology I was aiming for (but no, absolutely nothing to do with splunk – sorry).

I mentioned before that everyone had SQL access to the bug database, so this was just a case of writing a small SQLite-based application in Python to suck in the metadata from Mercurial changesets pushed to the gate, the results of a very debug-oriented compilation of the source tree (so each compiled object contained DWARF data showing which source files it had been compiled from), along with a few other heuristics to map files from the proto area to the source tree.

Once that was done, I just needed a Jenkins job to update the database periodically and we were in business!
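
For the curious, the sample queries later in this post imply a schema roughly along these lines – this is my reconstruction for illustration, not the real thing, with only the columns those queries touch:

import sqlite3

# Reconstructed from the sample queries below; the names match the queries,
# but the types (and everything else) are guesses.
SCHEMA = """
CREATE TABLE IF NOT EXISTS bug (
    bugid     TEXT PRIMARY KEY,
    synopsis  TEXT,
    author    TEXT,
    cat       TEXT,
    subcat    TEXT,
    changeset TEXT,
    date      INTEGER            -- unix epoch, hence date(date, 'unixepoch')
);
CREATE TABLE IF NOT EXISTS bug_source (
    bugid     TEXT,              -- a bug fixed by a change to ...
    path      TEXT               -- ... this source file
);
CREATE TABLE IF NOT EXISTS proto_source (
    proto_path  TEXT,            -- delivered file, e.g. usr/sbin/zpool
    source_path TEXT             -- source file it was built from
);
CREATE TABLE IF NOT EXISTS tags (
    name      TEXT,              -- gate tag name
    date      INTEGER
);
"""

conn = sqlite3.connect("spelunk.db")
conn.executescript(SCHEMA)
conn.commit()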

I then wrote a simple shell-script front end to the database that would allow me to execute arbitrary pre-canned SQL queries (looking up a set of files in a shared “query.d/” directory) ending up with a useful CLI that looked like this:

$ spelunk --help
usage: spelunk [options] [command]

options:

	-d|--database	Show when the spelunk database was last updated
	-h|--help       Show help
	-l|--list       Show known subcommands
	-p|--plain      Use simple text columns
	-s|--sqlite     Open an sqlite3 shell
	-v|--verbose    Verbose (show SQL)
	-w|--web        Format the output as html

$ spelunk

spelunk: no command listed. Use -h for help.

Commands:

NAME              DESCRIPTION
----              -----------
2day              putbacks from the last two days 
2dayproto         proto files changed in the last two days
2daysources       sources changed in the last two days
blame             who to blame for this source file (needs arg1)
bug               show this bugid (needs arg1)
build             fixes made in a specific build (needs arg1, a gate tag)
changeset         Show the bugs, sources and proto effects for this changeset (needs arg1)
lastbuild         fixes made in the build we just closed
proto             the proto files associated with this source file (needs arg1, a source file)
protobug          bugs and cat/subcat for this proto file (needs arg1, a proto file)
protochangeset    what this changeset does to the proto area
recentfilechanges recent changes to a source file pattern
source            the sources associated with this proto file (needs arg1, a proto file)
sourcebugs        bugs for this source file (needs arg1)
synopsis          show the bugs matching this synopsis (needs arg1)
tags              tags added to $SRC
thisbuild         fixes made so far in this build
today             fixes done in the last 24 hours
todayproto        changes to the proto area in the last 24 hours
todaysources      sources changed in the last 24 hours
week              fixes done in the last 7 days
weekproto         proto files changed in the last 7 days
weeksources       sources changed in the last 7 days
whatbuild         determine what build a bug was fixed in (needs arg1, a bugid)
whatcat           determine what category to file a bug against (needs arg1, a proto path)
yesterday         fixes done yesterday (ish)
yesterdayproto    proto files changed yesterday (ish)
yesterdaysources  sources changed yesterday (ish)

This was an excellent way of keeping on top of the ebbs and flows of development in the gate, and I found it really helpful to be able to write structured queries against the source tree.

With some work, Mercurial revsets would have helped with some of these sorts of queries, but they couldn’t answer all of the questions I wanted to ask since that required knowledge of:

  • mercurial metadata
  • bugdb metadata
  • a mapping of source files to compiled binaries

Most of the time, the CLI interface was all that I needed, but having this database on hand ended up being incredibly useful when dealing with other problems we faced during my time in the job.

Here are some sample questions and the resulting database queries:

1. What delivers /etc/motd?

sqlite> .header on
sqlite> SELECT * FROM proto_source WHERE proto_path='etc/motd';

2. What bug category should I use to log a bug against /usr/sbin/svcadm ?

sqlite> SELECT cat, subcat, COUNT(cat) AS cat_count, COUNT(subcat) AS subcat_count
   ...> FROM bug, bug_source, proto_source WHERE
   ...>     proto_source.source_path=bug_source.path AND
   ...>     bug.bugid=bug_source.bugid AND
   ...>     proto_source.proto_path='usr/sbin/svcadm'
   ...> GROUP BY cat, subcat ORDER BY cat_count DESC, subcat_count DESC LIMIT 1;

3. What recent changes happened that might have caused zpool to core dump?

sqlite> SELECT DISTINCT bug.bugid,changeset,synopsis,date(date, 'unixepoch') FROM bug, proto_source, bug_source WHERE
   ...>     proto_source.source_path=bug_source.path AND
   ...>     bug.bugid=bug_source.bugid AND
   ...>     proto_source.proto_path='usr/sbin/zpool' ORDER BY date DESC LIMIT 10;

4. I’m seeing a VM2 panic – who should I talk to?
(actually, you could probably use the bug database, our developer IRC
channel or a mailing list for this, but still…)

sqlite> SELECT bugid, synopsis, author from bug
   ...>     WHERE synopsis like '%vm2%' ORDER BY date DESC LIMIT 10;

5. Show me some recent changes that affected sfmmu.

sqlite>  SELECT bug.bugid, path, date(date, 'unixepoch'), changeset from bug
   ...>     NATURAL JOIN bug_source WHERE path LIKE '%uts/sfmmu/%' ORDER BY date DESC LIMIT 10;

6. What changes were made to RAD in July 2015?

sqlite> SELECT changeset, DATE(date, 'unixepoch'), path, author FROM bug
   ...>     NATURAL JOIN bug_source WHERE DATE(date, 'unixepoch') BETWEEN date('2015-07-01') AND DATE('2015-07-31') AND
   ...>     path LIKE '%/rad/%';

7. Did we make any changes to the CIFS kernel modules since s12_77?

sqlite> SELECT DISTINCT bug.bugid, date(date, 'unixepoch'), cat, subcat, synopsis, changeset FROM bug
   ...>     NATURAL JOIN bug_source WHERE date > (SELECT date FROM tags WHERE name='on12_77')
   ...>     AND path LIKE "%uts/common/fs/smb%" ORDER BY date DESC;

(of course, this wasn’t perfect – the fact that Mercurial changesets record the changeset-creation date rather than the integration date means the date-related queries weren’t quite right, but they were often close enough.)

People

It’s not all about tools, of course, which brings me to the last, and most important way of dealing with the information you need to process when doing this job.

Being able to have an answer to the Ghostbusters question (“who ya gonna call?”) when trying to assess the severity of a bug, or understand some breakage you’re seeing will really help to resolve issues quickly. As I’ve said already, one person can’t possibly debug every problem you’re going to face, but having a set of helpful contacts makes life immeasurably easier.

When talking to engineers about problems, be ready to point to an error message, a crash dump, or ideally a machine exhibiting the problem – that will make them much more willing to drop what they’re doing and quickly take a look. Just remember, these folks have day-jobs too, so do try to work out the problem or file a bug yourself first!

Finally, if you’re one of the many people I’ve pestered over the last 3.25 years, thank you so much – you’ve made Solaris a better place to work!

Being a tech lead and dealing with people

This is the second last post in my series about my 3.25 years as Solaris ON tech lead at Oracle.

Originally, this post was titled “arbitrate technical disputes”, but honestly, looking back, I actually didn’t have to do this terribly often. So, I’m just renaming it to “dealing with people”, and will offer some thoughts on my approaches.

For the most part, I was lucky to be dealing with an organisation of terribly smart engineers who were well-capable of resolving technical disputes without any help from me.

On the rare occasions I was asked to weigh in, I tended to find an easy path through the maze just by reframing the question:

“If I was a Solaris user with a substantial investment in the operating system, and ran into this problem (or something related to us choosing one approach over another) in production, what could the impact be, and how much would this ruin my day?”

When you address problems that way, the solution often becomes clear – it may not be the path that everyone wants, but ultimately our users trust that we have developed the best OS possible, and expect us to have made the right call.

Of course, not everything comes down to a technical decision. There were times when I disagreed with the readiness of code which was to be integrated, with teams either promising they’ll fix all the bugs after they putback or citing schedule reasons for wanting to integrate before the project was ready. On those occasions when my advice was overridden, sadly, I was often proved right.

More often, the reasons I needed to interact with people tended to fall into two distinct categories:

  • code and project reviews
  • dealing with gate breakage

and it’s worth talking about them both.

Code and project reviews

Apart from the technical aspect of performing code reviews, which I’m not going to talk about, I feel there’s a social skill needed to review changes. The engineer who has asked you for a code review believes you to be a peer or senior engineer, which already implies the request deserves a certain amount of respect from the start.

At the same time, it’s also good to remember that reviewing people’s code really does make you a better programmer – it’s easy to fall into the trap of thinking you’re “getting nothing done” simply by reviewing other engineers’ code, but that’s just not true: you’re advancing the state of the operating system by helping your colleagues succeed.

The other thing I try to bear in mind is that the engineer who has asked for a review likely feels this is their best work – coming across as especially disparaging or severe in your review comments is not usually a good way to get people to improve. Don’t be an asshole.

If you’re continually getting poorly thought out code sent to you for review, it might be time for a chat with the submitter. I’ve always tried to be sensitive in my comments during code reviews, offering suggestions when they’re needed, while at the same time not spoon-feeding all of the answers.

[ In the dim and distant past, one code review I sought as a junior engineer from a very senior engineer was met with this response: “This section is wrong, but I’m not going to tell you which part. Read this book, then tell me what the right answer is, and why.”  That response still sticks with me today, and I’m a better engineer for it. ]

Project reviews are slightly different from code reviews – at this point in the Solaris development process, the code has already been reviewed by subject-matter experts, test plans have been written, testing performed and results have been produced and reviewed. All of the bug database book-keeping and potential legal review has already occurred – we’re on final-approach.

At this point, as a technical lead, you’re being called in to offer your opinions and experience at a high level on all of that work. Are there other projects which recently integrated which ought to be tested with this change? Did the original reviewers consider that subsystem when looking at the changes? Are the build logs absolutely squeaky clean, and did the project team recently sync with the gate?

Reviewing projects is really just something that takes experience, and if you have the opportunity, sitting in on a project review, or reading ARC cases as they are discussed is an incredible way to learn more about engineering.

Dealing with gate breakage

From time to time, someone integrates a change which breaks the build. Despite all the tools and processes we have in place, humans can still be humans, and as tech lead I often had to get in touch to break the bad news.

From a technical perspective, the fix was usually simple: revert the change from the gate, or pester the responsible engineer for a fix.

From a personal interaction perspective, there were a few ways of dealing with this.

My approach was always to be as reasonable and professional as possible – the chances are, the engineer didn’t mean to break the gate, and they’re likely already hugely embarrassed that they’ve done so because their integration has likely been pulled into workspaces right across the company by the time we discover it.

Putting extreme pressure on them at that point would be unlikely to help matters, so I’d usually give them time and space to work out the problem, or make the call to revert the change. This is a different approach than others have used in the Solaris OS/Net consolidation in the past (oh, the stories!), but is very much the one we’ve been trying to encourage in the gate.

In my opinion, the most important thing is not whether an engineer makes a mistake, but how they recover: their attitude towards owning up to the problem, what they do to prevent themselves from making the same mistake again, and how we can make the build more likely to flag such problems.

 

To sum up – being a tech lead isn’t all about the code – you need interpersonal skills, and dealing with people in a kind and respectful way will make your job immeasurably easier. Don’t be an asshole.

Maintaining and improving developer productivity: heads-up and flag days

This is the next in the series of posts I’ve been writing about my role as ON technical lead for the Solaris organisation for the last 3.25 years.

I talked before about Project Lullaby, and the build tool improvements we made, along with some testing improvements – both of which certainly count as improving developer productivity.

What about maintaining productivity though? Providing new tools and processes is all very well, but how do we keep existing ones running and how do we communicate issues to the development community at large to let them know when they need to take action?

Heads up and flag day messages

Solaris engineering has long had a notion of formal “Heads up” and “Flag Day” notifications – each a slightly different form of organisation-wide email.

In a nutshell, a “Flag day” indicated something that a developer would have to take special action to deal with. Flag days were things like compiler version bumps, new pre-putback requirements or the addition of new source code repositories to the gate.

“Heads up” messages were less severe: for example, mentioning that a major project had landed, a new feature had arrived, a change in how the OS could be administered, or indeed the addition of a change to the gate which fixes a build problem that developers could encounter in the future (more on that later).

My predecessor, Artem Kachitchkine, did sterling work to further formalise these notifications, adding them to an Apache Solr index, broken down by biweekly build number. He also wrote a simple WSGI front-end so that the index could be searched easily from a web page and results displayed. Along with the heads-up and flag day mails, this also indexed any source-tree integrations that included an ARC (Architecture Review Committee) case, which typically indicated a larger body of work had just been pushed.

During my time as tech lead, we maintained this tradition, as it was clearly working well and developers were used to watching out for these messages.

Getting the quantity of heads-up messages right is an art: too few mass notifications, and you’ll spend time answering individual queries that would be better spent by writing a single announcement. Too many, and people will lose track or stop reading them.  Of course, this didn’t stop the phrase “Read the damned flag day!” being thrown about from time to time, despite our best efforts!

Similarly, getting the content of the messages correct was also an art. The core team and gatekeepers would typically ask for a pre-review of any heads-up or flag day message before it was sent out. Partly this was to look for mistakes in the content, but also this was to make sure the message was concise and to the point, and to avoid having to send out subsequent corrections.

Common things we’d look for in these messages would be:

  • In the mail, please list your bug IDs and ARC cases, explain what you’ve changed in the system, what system components have been affected and why the change constitutes a flag day.
  • If only certain systems are affected or only machines with a particular piece of hardware are affected, please mention that.
  • Show what any error from not following this flag day looks like – flag days are indexed, so including excerpts of error messages makes it easier for people to search for the relevant flag day.
  • Explain how users recover from the flag day. For example, are incremental ON builds broken so that gatelings need to do a ‘make clobber’ before doing a build? Will Install of a kernel with the change continue to work on a pre-change system?
  • List bug categories for any new feature.
  • Leave a contact email address.

So that explains how we’d communicate issues. How do we tell if issues are going to arise in the first place though? Enter Jenkins (again)

“Accelerated nightly builds”

I’ve mentioned before how Solaris is built into biweekly “milestone” builds, where we take packages built by every engineering group that contributes to the operating system, and produce a single image which can be installed or upgraded to.

For a long time, because of the complexity of the tools required to create such an image, this was the main way we built the OS – every two weeks, there’d be a new image and that was what got tested.

When Solaris 11 came along, with the new packaging system, and tools like Distribution Constructor, we made sure that the tools we used to build the OS were available to users too. This made it simple for engineers to produce their own entire OS images, though we found that not many actually did. Partly this was because it was just too inconvenient and time-consuming for all users to hunt down the latest packages from all engineering groups and hand-assemble their own image. Similarly, IPS allowed users to upgrade their systems from different sources at once, but again, it was inconvenient for users to configure their systems this way.

To make that easier, a group of us got talking to Solaris Release Engineering, and agreed to start assembling images nightly, with the very latest software from each engineering group – and thus “Accelerated Nightly” images came about. The terminology there is a little odd – we called it “accelerated” because Solaris RE already had “nightly” images, which they’d build for a few days leading up to their biweekly milestone build.

How does that help developer productivity? Well, in the ON consolidation, we created a Jenkins job to take these nightly images, install them on kernel zones and attempt to build our source tree. The most common source of breakage this found was when upgrades to components in the Userland Consolidation (the engineering group which packages FOSS software for inclusion in Solaris) that the ON build depended on caused the build to fail.

When we spotted breakage like this, we’d be able to quickly fix the build, integrate the fix (making sure that it also builds on non-accelerated-nightly systems) and send a heads-up mail to developers. Since most engineers would be building on machines running the most recent biweekly milestone build, this allowed us to document the potential future breakage, and the build would remain stable.

I know this doesn’t sound like rocket science, but it’s sometimes the simple things that make the most difference to a developer’s day – knowing that the build should be stable and compile cleanly all the time lets them concentrate on their changes, rather than worrying about what might go wrong the next time their build machine gets upgraded.

The accelerated nightly builds also contribute to the “Running the bits you build, and building the bits you run” philosophy – several engineers (myself included!) update their desktops daily to these bits, and enjoy playing with newly developed software without having to wait a whole two weeks for the next biweekly milestone to be released.

Ensuring the quality of ON deliverables

In an earlier post, I talked a little about what the ON tech lead job entails. In this post, I’m going to talk about some of the changes I made to keep raising the quality bar for the release.

Doing quality assurance for something as large as an operating system presents a few problems, similar to those for any software project, just on a larger scale:

  • writing a set of comprehensive automated tests (and having a means to analyze and baseline results)
  • ensuring those tests are maintained and executed frequently
  • knowing what tests to execute when making a change to the OS

In Solaris, we have several dedicated test teams to execute our automated tests periodically (both on nightly builds, and on biweekly milestone builds) as well as on-demand, and bugs are filed when problems occur. Each team tends to focus on a different area of the OS. We test changes from one consolidation at a time, before doing wider-area testing with changes from all consolidations built into a single OS image.

However, only finding problems after the relevant change has integrated to the source tree is often too late. It’s one thing to cause a problem with an application, but an entirely different thing if your change causes the system to panic – no more testing can be done on those bits, and the breakage you introduced impacts everybody.

To try to reduce the chance of that happening, we try to build quality into our development processes, to make the break-fix cycle as short as possible. To get a sense of where potential gaps were, I spent time looking at all of the testing that we do for Solaris, and documented it in a wiki page, ordering it chronologically.

I won’t repeat the entire thing here, but thought it might be interesting to at least show you the headings and subheadings. Some of this refers to specific teams that perform testing; other parts simply indicate the type of testing performed. This list now contains some of the test improvements I added, and I’ll talk about those later.

  • Before you push
    • You
    • Your Desktop/VM/cloud instances/LRT
    • The build
    • Your test teams
      • DIY-PIT and QSE-PIT
      • Selftest Performance testing
      • Project PIT runs
      • AK Testing
  • After you push
    • Incremental builds
    • Incremental boots
    • Incremental unit-tests
    • Periodic Live media boots
    • Nightly builds
    • Running the nightly bits on the build machines
    • Nightly WOS builds
    • Nightly gate tests
    • Nightly ON-PIT tests
    • Bi-weekly ON-PIT tests
  • After the WOS is built
    • Solaris RE tests
    • Solaris SST
    • SWFVT
    • ZFSSA Systems Test
    • Conformance testing
    • Performance testing
    • Jurassic, Compute Ranch build machines, shared group build machines
    • Platinum beta customers
  • Earlier releases
    • SRU testing

(A note on the terminology here:  “WOS” stands for “Wad Of Stuff” – it’s the biweekly Solaris image that’s constructed by bundling together all of the latest software from every consolidation into a single image which can be freshly installed, or upgraded to.

“PIT” stands for “Pre-Integration Test”, typically meaning testing performed on changes pushed to each consolidation’s source tree, but not yet built into a WOS image)

Running the bits you build

I’ve talked before about the ON culture of running the bits you develop, so won’t repeat myself here, except to say that the gate machine, the gate build machines, and all developers are expected to run at least biweekly, if not nightly, bits. As engineers, we tend to be a curious lot, and enjoy experimenting with shiny new software – it’s amazing (and a little worrying) to discover bugs that the test suites don’t catch. As we find such gaps in test suites, we file bugs against them so that the test suites continually improve.

Building the bits you run

Building Solaris itself turns out to be a good stress test for the operating system, invoking thousands of processes and putting a reasonable load on the system, more so if it’s a shared build machine.

The build itself also does a significant amount of work verifying that the software it’s producing is correct: apart from the obvious tools that run during the build, like lint and Parfait (an internal static analysis tool), there are a host of other checks that perform verification on the binaries that are produced.

Indeed, to maximise the chances of finding errors during the build, we compile the source tree with two separate compilers (currently Oracle Developer Studio and gcc), discarding the binaries produced by the “shadow compiler”. As the different compilers produce different warnings, sometimes one will report errors that the other misses, which can be an aid to developers.

The problem with catching problems early

As much as possible, we emphasise pre-integration testing to find problems early. The flip side of that is that not every engineer has access to some of our larger hardware configurations, and test labs containing them are a finite resource.

Another problem is that even with access to those large systems, how do you know which tests ought to be executed? Since lab time is limited and some tests can take a long time to complete, we simply can’t run every test before every integration; if we did, we’d never be able to make changes at a reasonable pace.

A common way for tests to be developed for Solaris was to have separate teams of test engineers maintain and update tests, rather than developers owning their own tests (this wasn’t the rule of course – some developers modified those test suites directly)

In some cases where engineers did write their own tests, the test code was often stored in their home directories – they’d know to execute the tests the next time they were making changes to their code, but nobody else would know of their existence, and breakage could occur.

The build was also lacking any way to describe which tests ought to be executed when a given piece of software changed, and it became a question of “received wisdom” and experience to determine what testing needed to be performed for any given change.

Continuous integration in ON

For some time (5+ years before my time, as far as I can tell), the ON gate has had a simple incremental build facility. As each commit happened to the source tree, some custom code, driven by cron and procmail, would select one of four previously built workspaces, pull the latest changes and kick off a build.

This typically found build-time problems quickly, so we’d be able to either back out changes that didn’t build, or get in touch with the responsible developer to arrange a fix before too many engineers were impacted. However, the binaries produced by those incremental builds were simply discarded, which seemed like a lost opportunity to me.

Even worse, from time to time, we’d get integrations that built fine, but actually failed to boot on specific systems due to inadequate testing!

So, to modernize our build infrastructure and plug this gap, I started looking into using Jenkins to not only periodically build the source tree, but also to update a series of kernel zones with those changes and make sure that we were capable of booting the resulting OS.

That was a pretty simple change, and I was pleased with how it turned out. Once that was in place for a few months, I started to wonder what else we could do with those newly booted zones?
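Booting the freshly built bits in a kernel zone is conceptually simple. As a rough illustration only (the zone name, publisher name and repository URL are placeholders, and this isn’t our actual Jenkins job), the update-and-boot-check step amounts to something like this:

#!/usr/bin/python3
# Rough sketch of "update a kernel zone with the incremental build,
# then check that it boots". Zone, publisher and URL are placeholders.
import subprocess
import time

ZONE = "onu-kz-1"
REPO = "http://nightly-builds/on-nightly/"

def zlogin(*cmd, **kwargs):
    return subprocess.run(["zlogin", ZONE] + list(cmd),
                          capture_output=True, text=True, **kwargs)

# Point the zone at the freshly published packages and update it.
# (pkg exits 4 when there is nothing to do, so no check=True on the update)
zlogin("pkg", "set-publisher", "-O", REPO, "on-nightly", check=True)
zlogin("pkg", "update")

# Reboot the zone, then wait for it to reach the multi-user-server milestone.
subprocess.run(["zoneadm", "-z", ZONE, "reboot"], check=True)
for _ in range(60):
    state = zlogin("svcs", "-H", "-o", "state",
                   "milestone/multi-user-server:default").stdout.strip()
    if state == "online":
        print("boot verified")
        break
    time.sleep(10)
else:
    raise SystemExit("zone failed to reach multi-user-server")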

Developer notifications and unit testing

I’ve mentioned already that in a large engineering organisation, it’s difficult to know what to test, and being dependent on a separate test group to implement your test suite can be frustrating. Of course, there can be advantages in that separation of duty – having a different pair of eyes looking at changes and writing tests can find problems that a developer would otherwise be blind to.

Given our experience with the IPS consolidation, and its use of unit tests, one of the Solaris development teams working in ON decided to take a similar route, wanting to add their tests to the ON source tree directly.

Rather than deal with the chaos of multiple teams following suit, I felt it was time to formalize how tests were added to the source tree, and to write a simple unit-test framework to allow those tests to be discovered and executed, as well as a way to advertise specific other testing and development advice that could be relevant when we detect modifications to a given source file.

Obviously there were some limits to what we could do here – some tests require specific hardware or software configurations, and so wouldn’t be appropriate for a set of build-time tests; other tests are too long-running to really be considered “unit tests”.

Other tests may require elevated privileges, or may attempt to modify the test machine during execution, so it can be tricky to determine when to write a separate test suite, vs. when to enroll in the gate unit-test framework.

As part of this work, I modified our “hg pbchk” command (part of our Mercurial extension that performs basic pre-putback verification on changesets about to be integrated to the source tree, essentially ensuring the integration paperwork is correct)

The pbchk command now loads all of the test descriptor files found in the workspace, reports if tests are associated with the sources being modified, and will print specific developer notifications that ought to be emitted when a given source file changes.

I think of it as a “Robotic CRT advocate” – to point out testing that ought to run prior to integration (the CRT, or “Change Review Team” are a group of senior engineers who must pre-approve each and every putback to the ON source tree and will see the results of ‘hg pbchk’ during their review, and will verify that testing was completed)
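The matching step itself is straightforward to sketch. Assuming a descriptor format where each section lists the source paths it cares about (the key names and file locations below are invented for illustration, not the real schema), the core of it looks something like this:

#!/usr/bin/python3
# Sketch of matching changed source files against test descriptors.
# The descriptor schema here (a 'files' key of glob patterns and an
# optional 'notification') is invented for illustration.
import configparser
import fnmatch
import glob

def load_descriptors(workspace):
    """Load every test.d/*.cfg descriptor found in the workspace."""
    sections = {}
    for path in glob.glob(f"{workspace}/**/test.d/*.cfg", recursive=True):
        cfg = configparser.ConfigParser()
        cfg.read(path)
        for name in cfg.sections():
            sections[name] = cfg[name]
    return sections

def advice_for(changed_files, sections):
    """Yield (section, message) pairs relevant to the files being putback."""
    for name, sect in sections.items():
        patterns = sect.get("files", "").split()
        if any(fnmatch.fnmatch(f, pat)
               for f in changed_files for pat in patterns):
            yield name, sect.get("notification",
                                 f"consider running 'build test {name}'")

if __name__ == "__main__":
    descriptors = load_descriptors(".")
    for section, msg in advice_for(["usr/src/lib/libuuid/uuid.c"], descriptors):
        print(f"{section}: {msg}")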

Over time, that test framework has been getting more and more use, and we now have tests that are easy to run, implemented as a simple build(1) task. Here are the ones we have today:

timf@whero[171] build test
INFO    : No config file passed: using defaults
STATE   : Starting task 'test'

Usage: build test <file name or section name>

FILE                 SECTION NAME                   SYNOPSIS
ak-divergence.cfg    ak-divergence          [ no synopsis available ]
build.cfg            build-notify           Warn when users modify AK build tools
build.cfg            build-test             A series of tests that exercise build(1)
corediag.cfg         core-diag-test         A series of tests that exercise coremond.
crypto-fips.cfg      crypto-fips-140        Crypto Framework FIPS 140-2 Boundary Change note.
daxstat.cfg          daxstat-test           A series of tests to exercise daxstat
elfsignrange.cfg     elfsignrange           Warn about grub2 duplication of elfsignrange code
fuser.cfg            fuser-test             A series of tests to exercise fuser
fwenum.cfg           fwenum-unit            Firmware Enumerator unit tests
libc.cfg             gettext                A simple test for gettext(3C)
gk.cfg               gk-test                A series of tests that exercise the gk tool
ipf2pf.cfg           ipf2pf-test            Test verifies ipf2pf is still sane
kom.cfg              kom-test               Unit tests for the KOM framework
libcmdutils.cfg      libcmdutils-test       A series of tests that exercise libcmdutils
libdax.cfg           libdax-test            A series of tests that exercise libdax
libdiskmgt.cfg       libdiskmgt-test        Test for dumping libdiskmgt cache and do inuse operation.
libkstat2.cfg        libkstat2_basic        A series of basic tests that exercise libkstat2
libkstat2.cfg        libkstat2_priv         A series of privileged tests that exercise libkstat2
libnvpair.cfg        libnvpair-test-27      libnvpair unit tests (Python 2.7)
libnvpair.cfg        libnvpair-test-34      libnvpair unit tests (Python 3.4)
libsdwarf.cfg        libsdwarf-test         A series of tests that exercise libsdwarf.
libuuid.cfg          libuuid-test           A series of tests that exercise libuuid
libv12n.cfg          libv12n-test           A series of tests that exercise libv12n.
mdb.cfg              mdb-ctf                Mdb CTF unit tests
memtype.cfg          memtype-test           A series of tests for memory types and attributes.
netcfg.cfg           netcfg-noexec          A series of tests that verify libnetcfg operation
odoc.cfg             odoc-test              A series of odoctool tests
pbchk.cfg            pbchk-test             Warn that pbchk tests must be run manually
pfiles.cfg           pfiles-test            A series of tests to exercise pfiles
rad-auth_1.cfg       rad-auth_1             Tests for RAD module authentication version 1
rad-loccsm_1.cfg     rad-loccsm_1           RAD locale test module loccsm version 1
odocprovider.cfg     rad-module-odocprovider_1-test Tests for RAD module odocprovider
rad-test.cfg         rad-test               A series of tests that exercise RAD daemon
rad-test-rbac_1.cfg  rad-test-rbac_1        Testing RBAC integration in RAD
libc.cfg             sendfile-test          sendfile unit tests for blocking socket and NULL offset
libc.cfg             sendsig                A unit test for the user signal (only for SPARC)
smf.cfg              smf                    A series of tests that exercise SMF
smf.cfg              smf-python             Tests for solaris.smf.*
smf.cfg              smf-sysconfig          The intersection of sysconfig and SMF
snoop.cfg            snoop                  Warn about PSARC 2010/429
spliceadm.cfg        spliceadm-test         A series of tests to exercise spliceadm
sstore.cfg           sstore-unit            Statistics store unit tests
sstore.cfg           sstore-unit2           Statistics store unit tests (Python 2)
libstackdiag.cfg     stackdiag-crosscheck   Feed stackdb records to libstackdiag, verify results
sstore.cfg           statcommon             Warn about statcommon
sxadm.cfg            sxadm-aslr             A series of tests that exercise aslr
sxadm.cfg            sxadm-noexec           A series of tests that exercise nx heap/nx stack
sxadm.cfg            sxadm-test             A series of tests that exercise sxadm
timespec.cfg         timespec-test          Test for timespeccmp macro
updatedrv-test.cfg   updatedrv-test         Warn about update_drv dynamic device configuration test
vboot.cfg            vboot                  Warn about grub2 duplication of verified boot cert code
verify_key2.cfg      verify_key2            Warn about duplication of verified boot development key definitions
vm2.cfg              vm2-test               Unit tests for the VM2 code
webuicoord.cfg       webuicoord-unit-2.7    WebUI Coordinator unit tests (Python 2.7)
webuiprefs.cfg       webuiprefs-unit-2.7    WebUI Preferences unit tests (Python 2.7)
zfs.cfg              zloop-test             Testing zloop, a framework for ZFS unit tests
zonecfg.cfg          zonecfg                Warn about ModifyingZonecfg wiki
zones.cfg            zones-test             Test wrappers around various zones system calls

STATE   : Finishing task 'test'
SUMMARY : 'test' completed successfully and took 0:00:00.

Of course, this is still a small fraction of the overall number of tests that run on the OS, but my hope is that we will continue to extend these unit tests over time. From my past experience as a test developer on the ZFS Test team, the easier you make tests to execute, the more likely a developer is to actually run them!

In conjunction with the more comprehensive Jenkins pipelines we have recently finished work on, this framework has been well received, and has found problems before customers do – which continues to make me very happy.

Project Lullaby

A few years before my appointment to the tech lead job, several of us in the org had been complaining about the tools used to build the ON source tree.

During his time as ON gatekeeper, James McPherson had become frustrated about the Makefiles used in the build, which were gradually growing out of control. New Makefiles would get created by copy/pasting older ones, and as a result, errors in poorly written Makefiles would propagate across the source tree, which was obviously a problem.

The ON build at the time also used to deposit built objects within the source tree, rather than to a separate build directory.

While that was convenient for developers, it meant some weird Makefile practices were needed to allow concurrent builds on NFS-mounted workspaces (to stop x86 and sparc builds from writing to the same filesystem location), and any generated sources (from code generators such as lex/yacc) could accidentally be committed to the repository. So, another goal of the project was to change the build so that it wouldn’t attempt to modify the source tree.

Along with that, we had some pretty creaky shell scripts which drove the build (primarily nightly.sh), producing a single giant log file.

Worse still, the build was completely monolithic – if one phase of the build failed, you had to restart the entire thing. Also, we were running out of alphabet for the command line flags – seeing this getopts argument made my heart sink:

+ABCcDdFfGIilMmNnOPpRrS:TtUuWwxz

Something had to be done.

So, “Project Lullaby” was formed – to put “nightly.sh” to sleep!

James and Mark Nelson started work on the Makefiles, and I set to work writing build(1).

My targets were nightly.sh, and its configuration file, often called developer.sh, along with a script to produce an interactive build environment, bldenv.sh.

We chose Python as an implementation language, deciding that while we probably could have written another set of shell scripts, a higher-level language was likely required, and that turned out to be a great decision.

This work was begun as essentially a port of nightly.sh, but as the tool progressed, we found lots of ways to make the developer experience better.

First of all, the shell scripts which defined the build environment had to go. “developer.sh” essentially just set a series of UNIX environment variables, but it didn’t do anything to clean up the existing environment – this meant that two builds of the same source tree by different engineers could produce different results, and we ran into some nasty bugs that way.

Not being able to easily audit the contents of the developer.sh script was also bad: since the configuration file was essentially executable code, it wasn’t possible to determine what the exact build environment would be without executing it, and that meant that it was difficult to determine exactly what sort of build would be produced by a given configuration file.

I replaced developer.sh with a Python ConfigParser file, bldrc, and made build(1) responsible for generating the config file. This meant that the version of build(1) in the workspace could always produce its own config file, so we’d never have mismatched tools, where we’d build the workspace with the wrong version of the tools.
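A minimal sketch of the idea, using Python’s configparser (the section and option names here are placeholders, not the real bldrc schema):

#!/usr/bin/python3
# Sketch of generating and validating a bldrc-style config file.
# Section and option names are placeholders, not the real schema.
import configparser

DEFAULTS = {
    "build": {
        "codeline": "on-nightly",   # placeholder values throughout
        "debug": "true",
        "check_parfait": "false",
        "log_dir": "log",
    },
}

def generate(path="bldrc"):
    cfg = configparser.ConfigParser()
    cfg.read_dict(DEFAULTS)
    with open(path, "w") as f:
        f.write("# Generated by build(1); run 'build explain <option>'"
                " for documentation.\n")
        cfg.write(f)

def load(path="bldrc"):
    cfg = configparser.ConfigParser()
    cfg.read(path)
    # Complain about options we don't recognise rather than ignoring them.
    unknown = set(cfg["build"]) - set(DEFAULTS["build"])
    if unknown:
        raise SystemExit(f"unknown bldrc settings: {', '.join(sorted(unknown))}")
    return cfg

if __name__ == "__main__":
    generate()
    print(dict(load()["build"]))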

Since all bldrc files would be generated the same way, it was easy to compare two files to see how the builds they would produce would differ, and easy to determine whether a build was ready to integrate (that is, all build checks had been run, and the build was clean)

Early on in the invocation of build(1), we would also verify that the build machine itself was valid: that the correct compilers were being used, that we had sufficient swap configured, and so on. Of course we also have packages to pull in build-time dependencies that ought to be installed on all build machines, but a belt-and-braces approach resulted in fewer surprises – nothing’s worse than getting a few hours into a build only to discover that we’re using the wrong compiler!

Furthermore, we made sure that we’d complain about config files with values we didn’t recognise, and also removed almost all comments from the generated file, instead implementing a build explain command to document what each variable did.

Finally, we included a build regenerate command, to allow a new version of build to generate a new config file from any older one, allowing us to upgrade from older versions of the tool without necessarily needing to version the config file format itself.

For the interactive build environment, we wrote build shell (aliased to build -i), which produced exactly the same UNIX shell environment used by the rest of the build tool (before, nightly.sh and bldenv.sh could end up using different environments!) We made sure to properly sanitize the calling environment, passing through certain harmless, but important variables such as $DISPLAY, $HOME, $PWD, $SSH_AGENT_* etc.
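The environment scrubbing is easy to picture: start from almost nothing and copy across only an allow-list of harmless variables. A simplified sketch, with an illustrative allow-list rather than the exact one we used:

#!/usr/bin/python3
# Sketch of building a sanitised environment for 'build shell' and build
# tasks. The allow-list and explicit settings are illustrative only.
import fnmatch
import os
import subprocess

ALLOWED = ["DISPLAY", "HOME", "PWD", "TERM", "SSH_AGENT_*", "SSH_AUTH_SOCK"]

def build_env(workspace):
    env = {name: value for name, value in os.environ.items()
           if any(fnmatch.fnmatch(name, pat) for pat in ALLOWED)}
    # Everything else is set explicitly by the tool, never inherited, so two
    # engineers building the same workspace get the same environment.
    env["PATH"] = "/usr/bin:/usr/sbin"
    env["CODEMGR_WS"] = workspace
    return env

if __name__ == "__main__":
    # 'build shell' then just runs an interactive shell with that environment.
    subprocess.run([os.environ.get("SHELL", "/bin/ksh")],
                   env=build_env("/builds/timf/my-workspace"))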

Having taken care of the build environment and config files, most of the rest of build(1) defined a series of build ‘tasks’ – some of which are composite tasks, so “build nightly” does “build setup”, “build install”, “build check”, etc. (this was just using the Composite design pattern)
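In sketch form, the Composite pattern here just means a task is either a leaf that does some work, or a list of other tasks (the task bodies below are stubs, of course):

#!/usr/bin/python3
# Sketch of composite build tasks; real build(1) tasks also handle logging,
# timing, ZFS snapshots and so on. Task names here are illustrative.
class Task:
    name = "task"
    def run(self, config):
        raise NotImplementedError

class SetupTask(Task):
    name = "setup"
    def run(self, config):
        print("dmake setup ...")

class InstallTask(Task):
    name = "install"
    def run(self, config):
        print("dmake install ...")

class CheckTask(Task):
    name = "check"
    def run(self, config):
        print("running proto-area checks ...")

class CompositeTask(Task):
    """A task made up of other tasks, run in order."""
    def __init__(self, name, subtasks):
        self.name = name
        self.subtasks = subtasks
    def run(self, config):
        for task in self.subtasks:
            task.run(config)

# 'build nightly' is simply a composite of the individual tasks.
nightly = CompositeTask("nightly", [SetupTask(), InstallTask(), CheckTask()])

if __name__ == "__main__":
    nightly.run(config={})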

Each build task writes its own log file, and we used Python’s logging framework to produce both plaintext and syntax-highlighted HTML log files, each with useful timestamps, and the latter with href anchors so you could easily point at specific build failures.
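The dual output falls out of Python’s logging module fairly naturally: one handler writes the plaintext log, another wraps each record in HTML with an anchor. A cut-down sketch, not the real implementation:

#!/usr/bin/python3
# Sketch of per-task plaintext + HTML logging; a cut-down illustration only.
import html
import logging

class HTMLFileHandler(logging.FileHandler):
    """Write each record as an anchored, styled HTML line."""
    def __init__(self, path):
        super().__init__(path)
        self.stream.write("<html><body><pre>\n")
        self.counter = 0
    def emit(self, record):
        self.counter += 1
        line = html.escape(self.format(record))
        css = "error" if record.levelno >= logging.ERROR else "info"
        self.stream.write(
            f'<a name="l{self.counter}"></a>'
            f'<span class="{css}">{line}</span>\n')
        self.stream.flush()

def task_logger(name):
    log = logging.getLogger(name)
    log.setLevel(logging.INFO)
    fmt = logging.Formatter("%(asctime)s %(levelname)-7s: %(message)s")
    for handler in (logging.FileHandler(f"{name}.log"),
                    HTMLFileHandler(f"{name}.log.html")):
        handler.setFormatter(fmt)
        log.addHandler(handler)
    return log

if __name__ == "__main__":
    log = task_logger("install")
    log.info("Starting task 'install'")
    log.error("cc: failed to compile foo.c")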

To avoid overloading developers, we made sure that, with few exceptions, all build tasks took the same command line arguments, to reduce the cognitive load on developers trying to learn how to build the source tree. Instead of adding arguments for slightly different flavours of a given command, we preferred to write a new build task (using class-based inheritance under the hood, of course)

Finally, we had a few “party tricks” that we were able to add in – build tasks which don’t produce build artifacts, but instead provide useful features that improve ON developers’ lives. For example, ‘build serve’ starts a simple ephemeral HTTP server pointing to the build logs in a workspace, allowing you to share logs with other engineers who might be able to fix a problem you’re seeing.

Similarly, we have a ‘build pkgserve’ task, which starts up an IPS repository server allowing you to easily install test machines over HTTP with the artifacts from your build.

“build pid” returned the process ID of the build command itself, and since all dmake(1S) invocations were run within a Solaris project(5), we were able to install a signal handler such that stopping an entire build was as easy as:

$ kill -TERM `build pid`

Finally, we added ZFS integration, such that before each build task was executed, we’d snapshot the workspace, allowing us to quickly rollback to the previous state of the workspace and fix any problems. This turned out not to be terribly useful by the time we’d shaken the bugs out of build(1) itself, but was incredibly helpful during its development.
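The ZFS integration is a thin wrapper around zfs(1M), something like the following sketch (the dataset name is invented, and it reuses the toy Task objects from the composite sketch above):

#!/usr/bin/python3
# Sketch of snapshotting the workspace dataset around each build task.
# The dataset name is a placeholder; error handling is minimal.
import subprocess

DATASET = "rpool/builds/timf/my-workspace"

def zfs(*args):
    subprocess.run(["/usr/sbin/zfs"] + list(args), check=True)

def run_with_snapshot(task, config):
    snap = f"{DATASET}@pre-{task.name}"
    zfs("snapshot", snap)
    try:
        task.run(config)
    except Exception:
        # 'zfs rollback -r' discards everything written since the snapshot,
        # returning the workspace to its pre-task state for inspection or retry.
        zfs("rollback", "-r", snap)
        raise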

One more important artifact was the mail notification developers get when the build completes. We spent time improving that format so that it was easier to determine which part of the build failed, excerpting relevant messages from the build logs so users could tell at a glance where the issue was.

Here’s a mail_msg sample:

Build summary for server i386 nightly build of /builds/ongk/workspace/nightly.build.server

Build status    : pass (RTI issues)
Started         : 2017-08-09 01:02:34
Completed       : 2017-08-09 03:37:57
Total time      : 2:35:22
Build config    : /builds/ongk/workspace/nightly.build.server/bldrc
Build version   : nightly.build.server-228:2017-08-09
Build pkgvers   : 11.4.0.0.0.3.37799

. changeset     : 5897c3a8526b (server) tip
usr/closed cs   : 61201571e908 tip
usr/man cs      : 5725f2ff08b3 tip
usr/fish cs     : [repository not present]

Start timestamp : 1502265754.6

Build machine details:
Identity        : hopper (i386)
WOS pkg         : pkg://solaris/entire@11.4,5.11-11.4.0.0.0.1.0:20170724T150254Z
core-os pkg     : pkg://nightly/system/core-os@11.4,5.11-11.4.0.0.0.2.37783:20170805T092540Z
osnet pkg       : pkg://solaris/developer/opensolaris/osnet@5.12,5.12-5.12.0.0.0.125.0:20170530T145839Z


=== Task summary ===
server-tstamp         : pass
server-check.rti_ready: fail (non-fatal)
server-clobber        : pass
server-tools          : pass
server-setup          : pass
server-install        : pass
server-packages       : pass
server-setup-nd       : pass
server-install-nd     : pass
server-packages-nd    : pass
server-generate_tpl   : pass
server-closed_tarball : pass
server-pkgmerge       : pass
server-check          : pass
server-check.protocmp : pass
server-check.elf      : pass
server-check.ctf      : pass
server-check.lint     : pass
server-check.cstyle   : pass
server-check.cores    : pass
server-check.findunref: pass
server-check.paths    : pass
server-check.pmodes   : pass
server-check.uncommitted: pass
server-check.install-noise: fail (non-fatal)
server-check.wsdiff   : pass
server-check.lint-noise: pass
server-update_parent  : pass

check.parfait         : disabled


=== Task details ===
--- server-check.rti_ready ---
(Check that this build config can be submitted for RTI)

Starting task 'server-check.rti_ready'
bldrc file: /builds/ongk/workspace/nightly.build.server/bldrc
check.parfait was disabled
One or more bldrc file settings were found that suggest this build is not ready for RTI
Finishing task 'server-check.rti_ready'
'server-check.rti_ready' failed (non-fatal) and took 0:00:07.

At the time of writing, here are all of the build(1) tasks we implemented:

timf@whero[123] build help -v
Usage: build [subcommand] [options]
       build -i [options] [commands]
       build --help [-v]

Subcommands:

all_tests           (a synonym for 'check.all_tests')
archive             Archive build products
check               Run a series of checks on the source and proto trees
check.all_tests     Run all tests known to the workspace
check.cores         Look for core files dumped by build processes
check.cstyle        Do cstyle and hdrchck across the source tree
check.ctf           Check CTF data in the non-debug proto area
check.elf           Run a series of checks on built ELF objects
check.elfsigncmp    Determines whether elfsigncmp is used to sign binaries
check.findunref     Find unused files in the source tree
check.fish          Do checks across the fish subrepo
check.install-noise Looks for noise in the install.log file
check.lint          Do a 'dmake lint' on the source tree
check.lint-noise    Looks for noise in the check.lint.log file
check.parfait       Run parfait analysis on a built workspace
check.paths         Run checkpaths(1) on a built workspace
check.pmodes        Run a pmodes check on a built workspace
check.protocmp      Run protolist and protocmp on a built workspace
check.rti_ready     Check that this build config can be submitted for RTI
check.splice        Compare splice build repositories to baseline
check.tests         Run tests for sources changed since the last build
check.uncommitted   Look for untracked files in the workspace
check.wsdiff        Run wsdiff(1) to compare this and the previous proto area
clobber             Do a workspace clobber
closed_tarball      Generates tarballs containing closed binaries
cscope              (a synonym for 'xref')
explain             Print documentation about any configuration variable
fish                Build only the fish subrepo
fish.ai_iso         Build 'nas' Fish AI iso images only
fish.conf           Verify mkak options
fish.destroy_dc     Remove datasets tagged 'onbld:dataset' at/under 'dc_dataset'
fish.gk_images      Build Fish images appropriate for gk builds
fish.images         Build 'nas' Fish upgrade images only
fish.install        Build Fish sources, writing to the fish proto area
fish.jds_ai_iso     Build 'jds' Fish AI iso images only
fish.jds_all_images Build all 'jds' Fish images
fish.jds_gk_images  Build 'jds' Fish images appropriate for gk builds
fish.jds_images     Build 'jds' Fish upgrade images only
fish.jds_txt_iso    Build 'jds' Fish text iso images only
fish.nas_all_images Build all 'nas' Fish images
fish.packages       Build Fish IPS package archives
fish.re_build       Runs AK image construction tasks for Release Engineering
fish.save_artifacts Save all build artifacts from the 'dc_dataset' directory
fish.txt_iso        Build 'nas' Fish text iso images only
generate            Produce a default bldrc configuration file
generate_tpl        Generate THIRDPARTYLICENSE files
help                Print help text about one or all subcommands
here                Runs a 'dmake  ...' in the current directory
hgpull              Do a simple hg pull for all repositories in this workspace
install             Build OS sources, writing to the proto area
kerneltar           Create a tarball of the kernel from a proto area.
nightly             Do a build, running several other subcommands
packages            Publish packages to local pkg(7) repositories
parfait_remind_db   Generate a database needed by the parfait_remind pbchk
pid                 Print the PID of the build task executing for this workspace
pkgdiff             Compare reference and resurfaced package repositories
pkgmerge            Merge packages from one or more repositories
pkgserve            Serve packages built from this workspace over HTTP
pkgsurf             Resurface package repositories
pull                Do a hg pull and report new changesets/heads
qnightly            Do a nightly build only if new hg changesets are available
regenerate          Regenerate a bldrc using args stored in a given bldrc
save_packages       Move packages to $PKGSURFARCHIVE as a pkgsurf reference
serve               Serve the contents of the log directory over HTTP
setup               Do a 'dmake setup', required for 'install' and 'here'
test                Runs tests matching test.d/*.cfg file or section names
tools               Do a 'dmake bldtools', for 'setup', 'install', and 'here'
tstamp              Update a build timestamp file
update_diag_db      Download a new copy of the stackdb diagnostic database
update_diverge_db   Generate AK/Solaris divergence database
update_parent       Update a parent ws with data/proto from this workspace
xref                Build xref databases for the workspace

I hope to talk about a few of these in more detail in future posts, but feel free to ask if you’ve any questions.

In the end, I’m quite proud of the work we did on Lullaby – the build is significantly easier to use, the results easier to understand and since the Lullaby project integrated in 2014, we’ve found it very simple to maintain and extend.

However, after we integrated, I have a feeling the folks looking for a new ON tech lead decided to give me a vested interest in continuing to work on it, and so, “Party Like a Tech Lead” began!

Party Like a Tech Lead

I haven’t written anything here for a while, though did hint about it in an earlier post. Now seems like a good time to talk about what I’ve been doing all this time.

From March 2014 – August 2017, I was the ON Technical Lead for the Solaris Marketing release, and so the “#partylikeatechlead” Twitter posts began.

We rotate the tech lead role through the Solaris engineering organisation, ideally every 18 months, in order to bring in fresh ideas and avoid burn-out, as the job can be a little grueling at times.

Initially I took on the role for what was to be the second half of Solaris 12 development, but over time, as release plans changed, the codebase I was working on became part of the plans for the continuous delivery model being adopted for Solaris and I stayed on to ease the transition.

Solaris itself is produced by multiple development teams, which we call “consolidations”, each roughly covering an area of the operating system. ON, or “OS/Net” (operating system and networking), is one of the main consolidations, and comprises the core kernel, drivers, and userspace libraries and commands. Many of the technologies you traditionally associate with Solaris are delivered by ON (ZFS, DTrace, SMF, etc.)

So what does the ON tech lead actually do? Well, each engineer that’s appointed to the job brings their own focus to it, but generally, the tech lead is supposed to:

  • ensure the quality of the ON deliverables
  • maintain and improve developer productivity
  • arbitrate technical disputes
  • review projects and bugfixes for integration readiness

Those are fairly broad goals, and exactly how each of those is implemented is left to each technical lead to decide, but I hope to write a post about each one to explain my approach.

However, there’s a higher level here – apart from the specific deliverables of the job, the ON Tech Lead acts as an example for other engineers in the organisation. They set a high-bar for quality, are patient and understanding with new engineers, and strict with those who should know better.

Ultimately, the aim is to produce an excellent operating system, but while doing so, attempt to improve the world for all Solaris engineers and users, so that we can all be more efficient in our development work and indirectly help our customers. Indeed, the ON core team, comprising the Tech Lead, the Lead Gatekeeper and the C-team lead, collectively own the ON build environment and tools, and are strongly encouraged to make improvements to them.

There are a few mantras in Solaris engineering that have existed since long before Oracle acquired Sun Microsystems – “FCS (first customer ship) quality all the time”, and “Find the bugs before customers do” are two of my favourites, and almost everything I’ve done over the last three and a quarter years was guided by those words.

In the next post, I’m going to talk a little more about how I got appointed to the role, via my work on “Project Lullaby“, and in subsequent posts, hope to talk more about exactly what I did during my tenure to advance Solaris.

In memoriam: Roger A. Faulkner

As current ON12 tech lead and a gatekeeper, I just got to push this changeset to the Solaris ON12 source tree.

I only had the privilege of meeting Roger a few times, but interacted with him over email at various points in my career at Sun and Oracle. He was an incredible engineer and an inspiration to us all – I’ll miss him and hope this is in some way a fitting tribute.

$ hg paths default
ssh://on12.us.oracle.com//export/on12-clone
$ hg log -r tip | grep -v user
changeset:   [public] 31766:9db5360cc19c
tag:         tip
date:        Wed Jul 06 09:52:46 2016 +0100
description:
        In memoriam: Roger A. Faulkner

Syntax highlighted OS/Net-form ‘hg log’

Mercurial has a color extension that I hoped might let us pretty-print bugids in hg log output, but never got around to trying to make it happen.

Well, a while ago, I burnt an afternoon getting it working. This is (really) ugly, but does the trick. Just add the following to your ~/.hgstyle, and update your ~/.hgrc file as described in the comments below:

# This syntax-highlights ON-format commit messages, writing
# the bugids in a lovely shade of blue. To use, add the
# following to your ~/.hgrc
# [ui]
# hgstyle = ~timf/.hgstyle
#
# [extensions]
# color =
# pager =
#

# My hacks for 'changeset' are horrendous. startswith can't
# take a regexp, but sub can, so to detect bugids, we replace
# a regexp with a string, then search for that string.
# I'm sorry.

# We see if we can find a bug id in the first word of the
# line. If we do, we color it blue and emit it,
# otherwise we emit nothing.
# Then, for printing the synopsis, we check (again)
# for a bugid and if we find one, remove it from the line and
# emit the rest of the line, otherwise we emit the whole line.

# While this is really really ugly, it protects us from
# a problem when printing the synopsis where if we tried
# doing:
#
#     sub(word('0', line), line)
#
# we would blow up if word 0 in a synopsis line is an invalid
# regular expression.
# (which actually happens in changeset 67b47fad41d4 in the
# IPS gate)

#
# Developer note: Mercurial templating functions are weird. In
# particular, if-statements take the form
#   if(expression, action, else-action)
#
# See https://www.selenic.com/hg/help/templates
#

changeset = 'changeset:   {label("log.changeset changeset.{phase}", "[{phase}]")} \
        {label("red", "{rev}:{node|short}")} {branches}\n\
        {tags}{parents}user:        {author}\n\
        date:        {date|date}\ndescription:
        \t{splitlines(desc) % "{if(startswith('BUGID',
                                              sub('[0-9]+', 'BUGID', word('0', line))),
                                   label('blue', word('0', line)))
                               }{if(startswith('BUGID_FOR_SYNOPSIS',
                                              sub('[0-9]+', 'BUGID_FOR_SYNOPSIS', word('0', line))),
                                   sub(word('0', line), '', line),
                                   line)
                               }\n"|tabindent}\n\n'

Which results in hg log output like this:

[Screenshot: syntax-highlighted hg log output]

I hope you find this useful! (comments on better implementations are welcome)

Updated 13th March 2016: I needed to make a few changes for Mercurial 3.4.1, which didn’t like the previous version, and have included those changes in the text above.

IPS changes in Solaris 11.2

We’ve just released Oracle Solaris 11.2 beta, and with it comes a considerable number of improvements in the packaging system, both for Solaris administrators and for developers who publish packages for Solaris.

Other than general bug fixes and performance improvements, I thought a few changes would be worth mentioning in a bit more detail, so here goes!

Admin changes

One of the focuses we had for this release was to simplify common administrative tasks in the packaging system, particularly for package repository management. Most of the changes in this section reflect that goal.

Mirror support

We’ve now made it extremely easy to create local mirrors of package repositories.

The following command will create a new repository in a new ZFS dataset in /var/share/pkg/repositories and will create a cron job which will periodically do a pkgrecv from all publishers configured on the system, keeping the local mirror up to date:

# svcadm enable pkg/mirror

If mirroring every publisher on the system is too much content, the mirror service uses the notion of a “reference image” in which you can configure the origins which should be mirrored (“/” is the default reference image). All SSL keys/certs are obtained by the service from the properties on the reference image.

So to mirror just the official release repository, you could do:

# pkg image-create /export/ref-image-release
# pkg -R /export/ref-image-release set-publisher -p http://pkg.oracle.com/solaris/release
# svccfg -s pkg/mirror:default setprop config/ref_image = /export/ref-image-release
# svccfg -s pkg/mirror:default refresh
# svcadm enable pkg/mirror

Of course, if you want to maintain several local repositories, each mirroring a different repository with separate mirror-update schedules, you can easily create a new instance of the pkg/mirror service to do that.

More settings are available in the config property group in the SMF service, and they should be self-explanatory.

pkgrecv –clone

The mirror service mentioned above is an additive mirror of one or more origins, receiving into a single pkg(5) repository from one or more upstream repositories.

For better performance we have also included a very fast way to copy a single repository. The --clone operation for pkgrecv(1) gives you an exact copy of a repository, optionally limiting the pkgrecv to specific publishers.

Scalable repository server

In the past, when serving repositories over HTTP where there was a high expected load, we’ve recommended using an Apache front-end, and reverse-proxying to several pkg.depotd(1M) processes using a load balancer.

We felt that this was a rather involved setup just to get a performant repository server, so for this release we’re introducing a new repository server which serves pkg(5) content directly from Apache.

Here’s what that looks like:

[Screenshot: IPS repository server]

Here, you can see a single pkg/depot, with associated httpd.worker processes, along with a series of pkg/server instances which correspond to the screenshot above:

#  svcs -p pkg/depot pkg/server
STATE          STIME    FMRI
online         18:44:21 svc:/application/pkg/depot:default
               18:44:20    713363 httpd.worker
               18:45:29    713595 httpd.worker
               18:45:29    713596 httpd.worker
               18:45:29    713597 httpd.worker
               18:47:54    713605 httpd.worker
               20:36:36    836614 httpd.worker
online         18:44:24 svc:/application/pkg/server:on12-extra
online         18:45:13 svc:/application/pkg/server:on12-nightly
online         18:45:25 svc:/application/pkg/server:pkg-gate

You can see that we only have processes associated with the pkg/depot service: the pkg/server instances here have properties set to say that they should not run a pkg.depotd(1M) instance, but instead should only be used for configuration of the pkg/depot server.

We can mix and match pkg/server instances which are associated with the pkg/depot service and instances which have their own pkg.depotd(1M) process.

The new pkg/depot service does not allow write access or publication, but otherwise responds to pkgrepo(1) commands as you would expect:

$ pkgrepo -s http://kura/pkg-gate info
PUBLISHER        PACKAGES STATUS           UPDATED
pkg5-nightly     12       online           2014-04-28T02:11:36.909420Z
pkg5-localizable 1        online           2014-04-28T02:11:40.244604Z

pkgrepo verify, fix

These were actually included in an S11.1 SRU, but they’re worth repeating here. We now have pkgrepo(1) subcommands to allow an administrator to verify and fix a repository, checking that all packages and files in the repository are valid, looking at repository permissions, verifying both package metadata and the files delivered by the package.

pkgrepo contents

You can now query the contents of a given package using the pkgrepo command (previously, you had to have a pkg(5) image handy in order to use “pkg contents”)

pkgrecv -m all-timestamps is the default

For most commands, you’d expect the most-commonly used operation to be the default. Well, for pkgrecv, when specifying a package name without the timestamp portion of the FMRI, we’ll now receive all packages matching that name, rather than just the latest one – which is what most of our users want by default. There are other -m arguments that allow you to change the way packages are matched, allowing you to choose the old behaviour.

SSL support for pkgrepo, pkgrecv, pkgsend

It’s now possible to specify keys and certificates when communicating with HTTPS repositories for these commands.

pkgsurf

pkgsurf(1) is a tool that implements something we’d always wanted: a way to streamline our publication processes.

When publishing new builds of our software, we’d typically publish all packages for every build, even if the packaged content hadn’t changed, resulting in a lot of packaging bloat in the repository.

The repository itself was always efficient when dealing with package contents, since files are stored by hash in the repository. However, with each publication cycle, we’d get more package versions accumulating in the repository, with each new package referencing the same content. This would inflate package catalogs, and cause clients to do more work during updates, as they’d need to download the new package metadata each time.

pkgsurf(1) allows us to compare the packages we’ve just published with the packages in a reference repository, replacing any packages that have not changed with the original package metadata. The upshot of this is a greatly reduced number of packages accumulating in, say, a nightly build repository, resulting in less work for clients to do when systems are updated where no actual package content has changed between builds.
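Stripped of the real-world details (hashes, signatures, catalog handling and so on), the core idea is easy to imagine: if nothing but the version differs between the newly published manifest and the reference one, keep the reference version. A toy sketch, not pkgsurf’s actual implementation:

#!/usr/bin/python3
# Toy sketch of the pkgsurf idea: if nothing but the pkg.fmri differs between
# the newly published manifest and the reference one, "resurface" the package
# by reusing the reference manifest. Not the real implementation.

def significant_actions(manifest_lines):
    """The set of actions that should force a new package version."""
    return {line for line in manifest_lines
            if not line.startswith("set name=pkg.fmri")}

def resurface(new_manifest, ref_manifest):
    if significant_actions(new_manifest) == significant_actions(ref_manifest):
        return ref_manifest     # content unchanged: keep the older version
    return new_manifest         # real change: publish the new version

if __name__ == "__main__":
    ref = ["set name=pkg.fmri value=pkg://x/foo@1.0,5.11-0.1:20120101T000000Z",
           "file 11111111 path=usr/bin/foo owner=root group=bin mode=0555"]
    new = ["set name=pkg.fmri value=pkg://x/foo@1.0,5.11-0.2:20120601T000000Z",
           "file 11111111 path=usr/bin/foo owner=root group=bin mode=0555"]
    print(resurface(new, ref) is ref)   # True: the package gets resurfaced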

This is really more of a package developer change, rather than a package administrative change, but it’s in this section because having fewer package versions to deal with makes administrators happy.

pkg exact-install

This is a fast way to bring a system back to a known state, installing all packages supplied on the command line (and their dependencies) and removing all other packages from the system. This command can be very helpful when trying to bring a system back into compliance with a set of allowed packages.

While the operation itself is fairly straightforward, we spent quite some time trying to decide on a name for it! It turned out that “exact-install”, the original suggestion, was the most descriptive. The old computer science adage that “there are only two hard things in Computer Science: cache invalidation and naming things”[1] remains safely intact.

–ignore-missing

Several pkg(1) subcommands now take a --ignore-missing argument, which prevents pkg(1) from reporting an error and returning when one of the packages presented on the command line wasn’t present in the image being operated upon.

Zones changes

The packaging system in Solaris has always been well-integrated with Solaris Zones, and with 11.2, we’ve improved that integration.

recursive linked-images operations

A common operation on systems with zones is to install or update a package in the global zone and all attached non-global zones. While pkg(1) has always ensured that packages in the global zone and non-global zones are compatible, apart from “pkg update” (with no arguments), most package operations would only apply to the global zone unless parent/child dependencies were specified on the package being installed or updated.

With Solaris 11.2, we now have a flag, -r, that can be used with pkg install, pkg uninstall and pkg update that will recurse into the zones on the system to perform that same packaging operation. The -z and -Z options can be supplied to select specific zones into which we should recurse, or exclude certain zones from being operated upon.

Actuators run for booted NGZ operations

This is really a side-effect of the work mentioned in the previous paragraph, but it bears repeating: actuators now fire in non-global zones as a result of package operations initiated in the global zone which needed to also operate in non-global zones.

Synchronous actuators

This applies only to global zones in this release (and non-global zones if you issued the pkg operation from within the zone, not recursive operations initiated from the global zone), but since we’ve just talked about actuators, now seems like a good time to mention it.

There are now --sync-actuators and --sync-actuators-timeout arguments for several pkg(1) subcommands that cause us to wait until all actuators have fired before returning, or to wait a specified amount of time before returning. That way, you can be sure that any self-assembly operations have completed before the pkg(1) client returns.

Kernel zones

Mike has written more about kernel zones, but I thought I’d make a small note about them with respect to the packaging system.

While the packaging system is well-integrated with traditional zones, it’s intentionally not integrated with kernel zones. That is, other than the initial installation of a kernel zone, there are no IPS interactions between a kernel zone and the global zone in which it’s hosted. The kernel zone is a separate IPS image, potentially running a different version of the operating system than the global zone.

Misc changes

system attributes support

The packaging system now has support for delivering files with system attributes (those visible using ls -/ c). See pkg(5) and chmod(1) for more details.

Multiple hash algorithm support

This is really a behind-the-scenes change, and for 11.2 it has no visible effects, but since I spent quite a while working on it, I thought it was worth mentioning :-) So far, the packaging system has used SHA-1 for all hashes calculated on package content and metadata throughout its codebase. We recognized that we’d want to support additional hash algorithms in the future, but at the same time ensure that old clients were compatible with packages published using algorithms other than SHA-1.

With this work, we revisited the use of SHA-1 in pkg(5) and made sure that the hash algorithm could be easily changed in the future, and that older clients using packages published with multiple hash algorithms would automatically choose the most favorable algorithm when verifying that package.

There’s work ahead to allow the publication of packages with more than one hash algorithm, but we’ve laid the foundations now for that work to happen.
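As a rough illustration of what “choosing the most favorable algorithm” means (the algorithm names here are generic labels, not the actual pkg(5) action attributes): a client walks its own preference list and verifies using the strongest hash it finds on the action, falling back to SHA-1 otherwise.

#!/usr/bin/python3
# Sketch of a client choosing the most favorable hash it understands when a
# package action carries several; the key names are illustrative only.
import hashlib

# Ordered most-preferred first. An older client simply has a shorter list,
# so it falls back to the SHA-1 value that is always published.
PREFERENCE = ["sha256", "sha1"]

def best_hash(hashes):
    for algorithm in PREFERENCE:
        if algorithm in hashes:
            return algorithm, hashes[algorithm]
    raise ValueError("no hash algorithm in common")

def verify(path, hashes):
    algorithm, expected = best_hash(hashes)
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(64 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected

# e.g. verify("/tmp/file", {"sha1": "...", "sha256": "..."}) on a new client;
# an old client that only knows "sha1" still verifies the same file.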

To close

That’s been a quick roundup of the changes that we have in IPS in 11.2. I hope you’ve found it interesting.

On a personal note, I’ve had a lot of fun working on some of these features (I didn’t work on all of them). Of late I’ve spent most of my time working on the OS/Net build system, and have a new role helping that consolidation along towards its next major release (“major” in a similar sense to “major motion picture”, not “SunOS 6.0” :-) so I won’t have as much time to spend on IPS for a while. I’ll try to dip my toe in, from time to time though!


[1] (and off-by-one errors) via http://martinfowler.com/bliki/TwoHardThings.html.

ZFS auto-snapshots: a new home for 0.12

A quick post here, to mention that if you still use the old (non-Python-based) zfs-auto-snapshot SMF service: since mediacast.sun.com went away and hg.opensolaris.org is no more, there’s not really anywhere for this service to live.

While this code was never intended to be any sort of enterprise-level backup facility, I still use this on my own systems at home, and it continues to work away happily.

I’ve uploaded a copy of the latest repository here, version 0.12, including a recent patch submitted by Jim Klimov, along with some Makefile changes to build an IPS package as part of the build.

For the IPS package, I’ve moved the service manifest to /lib/svc/manifest but left it in /var/svc/manifest for the SVR4 package.

Enjoy!

IPS changes in Solaris 11.1, brought to you by the letter ‘P’

[Image: original by veggiefrog, on Flickr]

I thought it might be a good idea to put together a post about some of the IPS changes that appear in Solaris 11.1. To make it more of a challenge, everything I’m going to talk about here, begins with the letter ‘P‘.

Performance

We’ve made great progress in speeding up IPS. I think performance bugs tend to come in a few different flavours: bugs that are difficult to solve or subtle, huge and obvious ones, bugs that can be solved by doing tasks in parallel, and bugs that are really about the perception of performance rather than actual performance. We’ve come across at least one of each of those flavours during the course of our work on 11.1.

Shawn and Brock spent time digging into general packaging performance, carefully analyzing the existing code and testing changes to improve performance and reduce memory usage. Ultimately, their combined efforts resulted in a 30% boost to pkg(5) performance across the board, which I think was pretty impressive.

Other performance bugs were much easier to spot and fix. For example, 'pkg history' performance on systems with lots of boot environments was atrocious: my laptop, with 1796 pkg history entries, was taking 3 minutes to run 'pkg history' with the S11 IPS bits; after the fix, the command runs in 11 seconds, another good performance improvement, albeit one of lesser significance.

I’ll mention some other performance fixes in the next two sections.

Parallel zones

Apart from trying to perform operations more quickly, a typical way to address performance problems is to make the system faster by doing things in parallel. In this case, in the previous release, 'pkg update' in a global zone that contains many non-global zones was quite slow because we worked on one zone at a time. For S11.1, Ed did some excellent work to add the ‘-C‘ flag to several pkg(1) subcommands, allowing multiple zones to be updated at once.

Ed’s work wasn’t simply to perform multiple operations in parallel, but also to improve what was being done along the way – it was a lot of change, and it was well worth it.

With the work we’ve done in the past on the system-repository, these parallel updates are network-efficient, with caching of packaged content for zones being provided by the system repository.

Progress-tracking

Sometimes you can make a system appear faster by making the user interface provide more feedback on what is being performed. Dan added some wonderful new progress tracking code to all of the pkg(5) tools, changing the tools to use that API.

So, if the older "Planning /-|-\ " spinner was frustrating you, then you’ll definitely enjoy the changes here. It’s hard to show an example of the curses-terminal-twiddling in this blog post, so here’s what you’d see when piping the output (the progress tracking code can tell when it’s talking to a terminal, and it adjusts the output accordingly):

root@kakariki:~# pkg install squid | tee
 Startup: Refreshing catalog 'solaris' ... Done
 Startup: Caching catalogs ... Done
Planning: Solver setup ... Done
Planning: Running solver ... Done
Planning: Finding local manifests ... Done
Planning: Fetching manifests: 0/1  0% complete
Planning: Fetching manifests: 1/1  100% complete
Planning: Package planning ... Done
Planning: Merging actions ... Done
Planning: Checking for conflicting actions ... Done
Planning: Consolidating action changes ... Done
Planning: Evaluating mediators ... Done
Planning: Planning completed in 10.32 seconds
           Packages to install:  1
       Create boot environment: No
Create backup boot environment: No
            Services to change:  1

Download:    0/1715 items  0.0/8.8MB  0% complete 
Download:  230/1715 items  0.3/8.8MB  3% complete (59.9k/s)
Download:  505/1715 items  0.5/8.8MB  6% complete (55.6k/s)
.
. [ ed. I removed a few lines here ]
.
Download: 1417/1715 items  8.3/8.8MB  94% complete (140k/s)
Download: 1653/1715 items  8.7/8.8MB  99% complete (85.2k/s)
Download: Completed 8.78 MB in 76.46 seconds (117k/s)
 Actions:    1/1903 actions (Installing new actions)
 Actions: Completed 1903 actions in 3.23 seconds.
Finalize: Updating package state database ...  Done
Finalize: Updating image state ...  Done
Finalize: Creating fast lookup database ...  Done

Proxy configuration

I suppose this could also be seen as a performance bug (though the link is tenuous, I admit)

Behind the scenes, pkg(5) tools use libcurl to provide HTTP and HTTPS transport facilities, and we inherit the support that libcurl provides for web proxies. Typically a user would set a $http_proxy environment variable before running their IPS command.

At home, I run a custom web-proxy, through which I update all of my Solaris development machines (most of my systems reside in NZ, but many of my repositories are in California, so using a local caching proxy is a big performance win for me)

Now, I could use pkgrecv(1) to pull updates to a local repository every build, and while this is great for users who want to maintain a “golden master” repository, it’s not an ideal solution for a user like me who updates their systems every two weeks: the upstream repository tends to have a bunch of packages that I will never care about, I’m unlikely to ever need to worry about sparc binaries at home, and I’m never sure which packages I’ll want to install, so I prefer the idea of a transparent repository cache rather than having to populate and maintain a complete local repository.

Unfortunately, quite often I’d find myself forgetting to set $http_proxy before running ‘pkg update‘, and I’d end up using more bandwidth than I needed to, and when using repositories that were only accessible with different proxies, things tended to get a bit messy.

So, to scratch that itch, we came up with the "--proxy" argument to "pkg set-publisher", which allows us to associate proxies with origins on your system. The support is provided at the individual origin level, so you can use different proxies for different URLs (handy if you have some publishers that live on the internet, and others that live on your intranet)

To make things easier for zones administrators, the system-repository inherits that configuration automatically, so there’s no need to set the ‘config/http_proxy‘ option in the SMF service anymore (however, if you do set it, the service will use that value to override all --proxy settings on individual origins)

As part of this work, we also changed the output of "pkg publisher", removing those slightly confusing "proxy://http://foobar" URIs. Now, in a non-global zone, we show something like this:

root@kakariki:~# pkg publisher
PUBLISHER                   TYPE     STATUS P LOCATION
solaris        (syspub)     origin   online T <system-repository>
solaris        (syspub)     origin   online F <system-repository>
solaris        (syspub)     origin   online F http://localhost:8080/

This particular zone is one that’s running on a system which has a HTTP origin and a file-based origin in the global zone, and a HTTP origin that has been manually added to the nonglobal zone. The “P” column indicates whether a proxy is being used for each origin (“T” standing for “true”, indicating HTTP access going through the system repository, and “F” standing for “false”, showing the file-based publisher being served directly from the system-repository itself, as well as the zone-specific repository running on port 8080 in that zone)

We print more details about the configuration using the "pkg publisher <publisher>" command:

root@kakariki:~# pkg publisher solaris

            Publisher: solaris
                Alias: 
           Origin URI: http://ipkg.us.oracle.com/solaris11/dev/
                Proxy: http://localhost:1008
              SSL Key: None
             SSL Cert: None
           Origin URI: http://localhost:1008/solaris/db0fe5e2fa5d0cabc58d864c318fdd112587cd89/
              SSL Key: None
             SSL Cert: None
           Origin URI: http://localhost:8080/
              SSL Key: None
             SSL Cert: None
          Client UUID: 69d11a50-0e13-11e2-bc49-0208209b8274
      Catalog Updated: October  4, 2012 10:56:53 AM 
              Enabled: Yes

P5p archive support and zones

This isn’t related to performance (unless you count a completely missing piece of functionality as being a particularly severe form of performance bug!). When implementing the system-repository for S11, we ran out of runway and had to impose a restriction on the use of “p5p” archives when the system had zones configured. This work lifts those restrictions.

The job of the system-repository is to allow the zone to access all of the pkg(5) repositories that are configured in the global zone, and to ensure that any changes in the publisher configuration in the global zone are reflected in every non-global zone automatically.

To do this, it uses a basic caching proxy for HTTP and HTTPS-based publishers, and a series of Apache RewriteRule directives to provide access to the file-based repositories configured in the global zone.

P5p files were more problematic: these are essentially archives of pkg(5) repositories that can be configured directly using ‘pkg set-publisher‘. The problem was that no amount of clever RewriteRules would be able to crack open a p5p archive and serve its contents to the non-global zone.

We considered a few different options on how to provide this support, but ended up with a solution that uses mod_wsgi (which is now in Solaris, as a result) to serve the contents directly. See /etc/pkg/sysrepo/sysrepo_p5p.py if you’re interested in how that works, but there’s no administrator interaction needed when using p5p archives, everything is taken care of by the system-repository service itself.

Pruning and general care-taking

According to hg(1), we’ve made 209 putbacks containing 276 bug fixes and RFEs to the pkg-gate since S11. So aside from all of the performance and feature work mentioned here, Solaris 11.1 comes with a lot of other IPS improvements – definitely a good reason to update to this release.

If you’re running on an Illumos-based distribution and you don’t have these bits in your distribution, I think now would be an excellent time to sync your hg repositories and pull these new changes. Feel free to ping us on #pkg5 on irc.freenode.net if you’ve any questions about porting, or anything else really – we’re a friendly bunch.

You can see a list of the changes we’ve made at

http://src.opensolaris.org/source/history/pkg/gate/

Per-BE /var subdirectories (/var/share)

OK, that’s a slightly contrived name for this feature (only used here so it could begin with ‘P’). We’ve been calling this “separate /var/share” while it was under development.

Technically, this isn’t an IPS change, it’s a change in the way we package the operating system, but it’s a concrete example of one of the items in the IPS developer guide on how to migrate data across directories during package operations using the ‘salvage-from‘ attribute for ‘dir‘ actions.

This change moves several directories previously delivered under /var onto a new dataset, rpool/VARSHARE, allowing boot environments to carry less baggage around as part of each BE clone, sharing data where that makes sense. Bart came up with the mechanism and prototype to perform the migration of data that should be shared, and I finished it off and managed the putback.

For this release, the following directories are shared:

  • /var/audit
  • /var/cores
  • /var/crash (previously unpackaged!)
  • /var/mail
  • /var/nfs
  • /var/statmon

Have a look at /lib/svc/method/fs-minimal to see how this migration was performed. Here’s what pkg:/system/core-os looks like when delivering actions that salvage content:

$ pkg contents -H -o action.raw -a 'path=var/.migrate*' core-os | pkgfmt
dir  path=var/.migrate owner=root group=sys mode=0755
dir  path=var/.migrate/audit owner=root group=root mode=0755 reboot-needed=true \
    salvage-from=/var/audit
dir  path=var/.migrate/cores owner=root group=sys mode=0755 reboot-needed=true \
    salvage-from=/var/cores
dir  path=var/.migrate/crash owner=root group=sys mode=0700 reboot-needed=true \
    salvage-from=/var/crash
dir  path=var/.migrate/mail owner=root group=mail mode=1777 reboot-needed=true \
    salvage-from=/var/mail
$ pkg contents -H -o action.raw -a 'target=../var/share/*' core-os | pkgfmt
link path=var/audit target=../var/share/audit
link path=var/cores target=../var/share/cores
link path=var/crash target=../var/share/crash
link path=var/mail target=../var/share/mail

As part of this work, we also wrote a new section 5 man page, datasets(5), which is well worth reading. It describes the default ZFS datasets that are created during installation, and explains how they interact with system utilities such as swap(1M), beadm(1M), useradd(1M), etc.
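
If you’re curious, a quick way to see the new dataset on an installed system is something like this (output omitted, since sizes will obviously vary):

$ zfs list -o name,mountpoint,used rpool/VARSHARE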

Putting the dev guide on docs.oracle.com

Finally, it’s worth talking a bit about the devguide. We wrote the IPS Developer Guide in time for the initial release of Solaris 11, but didn’t quite make the deadline for the official docs.oracle.com documentation release, leading us to publish it ourselves on OTN and opensolaris.org. Since then, we’ve had complaints about the perceived lack of developer documentation for IPS, which was unfortunate.

So, for Solaris 11.1, Alta has converted the guide to DocBook and done some cleanup on the text (the content is largely the same), and it will be available on docs.oracle.com in all its monochrome glory.


I think that’s all of the Solaris 11.1 improvements I’ll talk about for now – if you’ve questions on any of these, feel free to add comments below, mail us on pkg-discuss or pop in to #pkg5 to say hello. I’ll update this post with links to the official Solaris 11.1 documentation once it becomes available.

Mini Jurassic – my home server

At Sun, and now Oracle, we have a server called Jurassic. The machine was previously hosted in MPK17, the office in Menlo Park, CA where many of the Solaris kernel developers worked (now occupied by a bunch of kids improving the lot of humanity through careful application of advertising technology). It now lives in the Santa Clara offices, where many of us have since moved.

Every two weeks, jurassic is updated to the latest development builds of Solaris. Less frequently, it gets a forklift upgrade to more recent hardware to improve test coverage on that platform. The “Developing Solaris” document has this to say about jurassic:

You should assume that once you putback your change, the rest of the world will be running your code in production. More specifically, if you happen to work in MPK17, within three weeks of putback, your change will be running on the building server that everyone in MPK17 depends on. Should your change cause an outage during the middle of the day, some 750 people will be out of commission for the order of an hour. Conservatively, every such outage costs Sun $30,000 in lost time [ed. note from timf: I strongly suspect this is lower now: newer jurassic hardware along with massive improvements in Solaris boot time, along with bootable ZFS means that we can reboot jurassic with the last stable Solaris bits very quickly and easily nowadays, though that’s not an excuse to putback a changeset that causes jurassic to tip over] — and depending on the exact nature of who needed their file system, calendar or mail and for what exactly, it could cost much, much more.

If this costs us so much, why do we do it? In short, to avoid the Quality Death Spiral. The Quality Death Spiral is much more expensive than a handful of jurassic outages — so it’s worth the risk. But you must do your part by delivering FCS quality all the time.

Does this mean that you should contemplate ritual suicide if you introduce a serious bug? Of course not — everyone who has made enough modifications to delicate, critical subsystems has introduced a change that has induced expensive downtime somewhere. We know that this will be so because writing system software is just so damned tricky and hard. Indeed, it is because of this truism that you must demand of yourself that you not integrate a change until you are out of ideas of how to test it. Because you will one day introduce a bug of such subtlety that it will seem that no one could have caught it.

And what do you do when that awful, black day arrives? Here’s a quick coping manual from those of us who have been there:

  • Don’t pretend it didn’t happen — you screwed up, but your mother still loves you (unless, of course, her home directory is on jurassic)
  • Don’t minimize the problem, shrug it off or otherwise make light of it — this is serious business, and your coworkers take it seriously
  • If someone spent time debugging your bug, thank them
  • If someone was inconvenienced by your bug, apologize to them
  • Take responsibility for your bug — don’t bother to blame other subsystems, the inherent complexity of Solaris, your code reviewers, your testers, PIT, etc.
  • If it was caught internally, be thankful that a customer didn’t see it [ed. note from timf: emphasis mine – this is the most important bit for me]

But most importantly, you must ask yourself: what could I have done differently? If you honestly don’t know, ask a fellow engineer to help you. We’ve all been there, and we want to make sure that you are able to learn from it. Once you have an answer, take solace in it; no matter how bad you feel for having introduced a problem, you can know that the experience has improved you as an engineer — and that’s the most anyone can ask for.

So, naturally, my home directory in CA is on jurassic, and whenever I’m using lab machines in California, I too am subject to whatever bits are running on jurassic.

However, I don’t live in California – I work remotely from New Zealand, and as good as NFSv4 is, I don’t fancy accessing all my content over the Pacific link.

I strongly believe in the sentiment expressed in the Developing Solaris document though, so my solution is to run a “mini-jurassic” at home, a solution I expect most other remote Solaris developers use.

My home server was previously my desktop machine – a little 1.6GHz Atom 330 box that I wrote about a while ago. Since Oracle took over, I now run a much more capable workstation with a Xeon E3-1270 @ 3.40GHz, a few disks and a lot more RAM :) Despite the fact that the workstation also runs bits from bi-weekly builds of Solaris, it doesn’t do enough to even vaguely stress the hardware, so when I got it at the beginning of the year, I repurposed the old Atom box as a mini-jurassic.

Here are the services I’ve got running at the moment:

ZFS

… well, obviously. The box is pretty limited in that it’s maxed out at 4GB of RAM, and non-ECC RAM at that (I know – I’ll definitely be looking for an ECC-capable board next time, though I haven’t checked whether there are any low-power mini-ITX boards out at the moment).

With only three disks available, I use a single disk for the bootable root pool and a pair of disks, mirrored, for the main data pool. I periodically use ZFS to send/recv important datasets from the mirror to other machines on my home network. I suspect whenever I next upgrade the system, I’ll buy more disks and use a 3-way mirror: space hasn’t been a problem yet, the main data-pool is just using 1.5TB disks, and I’m only at 24% capacity.
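
The send/recv step itself is nothing fancy – roughly the following, with made-up dataset, snapshot and host names:

$ zfs snapshot tank/home@2012-10-26
$ zfs send -i tank/home@2012-10-19 tank/home@2012-10-26 | \
    ssh backuphost zfs recv -F backup/home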

Certain datasets containing source code also have ZFS encryption enabled, though most of the code I work on resides on non-encrypted storage (because it’s all still open source and freely available).

I run the old zfs-auto-snapshot service on the system so that I always have access to daily, hourly, and every-15-minute snapshots of the datasets I really care about.
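
Enabling the schedules I care about is just a matter of switching on the relevant service instances – the FMRIs below are abbreviated and from memory, so check svcs -a on your own system:

$ svcadm enable auto-snapshot:frequent
$ svcadm enable auto-snapshot:hourly
$ svcadm enable auto-snapshot:daily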

NFS

I serve my home directory from here, and it automounts onto my laptop and workstation; it’s also shared to my Mac. Whenever I have to travel, I use ZFS to send/receive all of the datasets that make up my home directory over to my laptop, then send them back when I return.

CIFS

The Windows laptop mounts its guest Z: drive from the CIFS server, which shares a single dataset from the data pool (with a quota on that dataset, just in case). This is also shared to my Mac.
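
Creating that kind of share is a one-liner – something like this, with a made-up dataset name and quota:

$ zfs create -o quota=200g -o sharesmb=on tank/windows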

An Immutable Zone

Immutable zones are a new feature in Solaris 11. I have a very stripped-down, internet-facing zone running FeedPlus (a simple Python script driven from cron) and a minimal web server. The zone has resource controls set to give it only 256MB of RAM to prevent it from taking over the world. I really ought to configure Crossbow to limit its bandwidth as well.
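
The relevant zonecfg settings are roughly these (the zone name is made up, and this is only a sketch of the two pieces mentioned above):

# zonecfg -z webzone
zonecfg:webzone> set file-mac-profile=fixed-configuration
zonecfg:webzone> add capped-memory
zonecfg:webzone:capped-memory> set physical=256m
zonecfg:webzone:capped-memory> end
zonecfg:webzone> commit
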
A read/write zone

The standard flavour of zone has been around for a while now. This one runs the web server for the house, sharing music and video content. All of the content actually resides in the global zone, but is shared into the zone using ZFS clones of the main datasets, which means that even if someone goes postal in the zone, all of my data is safe.
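
Sharing the content that way is just a snapshot and a clone per dataset – along these lines, with made-up names (the clone then gets delegated or mounted where the zone can see it):

$ zfs snapshot tank/media@share
$ zfs clone tank/media@share tank/zones/www/media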

The zone also runs my IRC logger for #pkg5 on Freenode (helpful when you work in a different timezone).

IPS updates

The system gets upgraded every two weeks, creating a new boot environment for the zones as well as the global zone. It updates through a caching HTTP proxy running on my workstation, which helps further minimise bandwidth when I update all of my local machines once new bits become available (though IPS is already pretty good at keeping bandwidth to a minimum, only downloading the files that change during each update).
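
pkg(1) is happy to use a proxy specified in the environment, so pointing a machine at the cache is as simple as the following (the proxy host and port are made up):

$ export http_proxy=http://workstation.local:3128
$ pkg update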

I tend to run several other stable and experimental bits and pieces on my home systems, both on the little Atom box and on my workstation. These mostly relate to my day job improving IPS in Solaris, and they’ve already proved their worth, both in terms of shaking out bugs and in making my life as a remote worker a lot easier. I hope to write more about some of those in a future post sometime.

As more capabilities get added to Solaris, as with the jurassic server in California, I try as much as I can to find ways to exercise those new bits, because as it says on the jurassic web page:

Every problem we find and fix here is a problem which a customer will not see.

and that’s a good thing.

FeedPlus – converting G+ feeds to Atom, and now Twitter

I’ve added a feature to FeedPlus, the command-line G+ to Atom converter I mentioned previously: it now has an option to post directly to your Twitter stream, rather than going via Twitterfeed.

I found that, despite Twitterfeed claiming to check its configured RSS/Atom URL every 30 minutes before reposting to Twitter, updates weren’t being pulled that frequently, and a relatively timely G+ update I’d written would often appear several hours later. I decided it’d be better to work out how to post directly to Twitter.

For webapps, this didn’t appear to be too complex, but it’s a bit more involved for CLI applications (OAuth, yuck!). python-twitter helps a lot though, and it seems to be working so far. For now, I’ve got this running in a cron job that fires every 5 minutes, and while it’s definitely not as network-efficient as it could be, it’s doing the job just fine.
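
The crontab entry itself is nothing special – something like the one below, where the wrapper script path is made up, and the minutes are spelled out because (at least with the cron syntax I’m used to on Solaris) there’s no */5 shorthand:

0,5,10,15,20,25,30,35,40,45,50,55 * * * * /opt/feedplus/run-feedplus.sh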

As before, I’m really hoping Google will come along with proper G+ to Twitter bridging (and in that direction: yes it’s lossy, but 140 characters just seems so restricting these days!) I’m also hoping that more of the folks I follow on Twitter will ditch both Facebook and Twitter, and give G+ another go – we’ll see.

Comments and bug reports welcome over on the FeedPlus Github page.

The IPS System Repository

I’m excited about today’s launch of Solaris 11 – I’ve been contributing to Solaris for quite a while now, pretty much since 1996, but my involvement in S11 has been the most fun I’ve had of any release so far.

I’ve talked before about some of the work I’ve done on IPS over the last two years – pkg history, pkgdepend, pkglint and pkgsend – and most recently, helping to put together the IPS Developer Guide.

Today, I’m going to talk about the system repository and how I helped.

How zones differ from earlier releases

Zones that use IPS are different from those in Solaris 10, in that they are always full-root: every zone contains its own local copy of each package, and zones don’t inherit packaged content from the global zone as “sparse” zones did in Solaris 10.

This simplifies a lot of zone-related functionality: for the most part, administrators can treat a zone as if it were a full Solaris instance, albeit a very small one – by default, new zones in S11 are tiny. However, packaging with zones is a little more complex, and the system aims to hide that complexity from users.

Some packages in the zone always need to be kept in sync with those packages in the global zone. For example, anything which delivers a kernel module and a userland application that interfaces with it must be kept in sync between the global zone and any non-global zones on the system.

In earlier OpenSolaris releases, after each global-zone update, every non-global zone had to be updated by hand by detaching and re-attaching it. During that detach/attach, the ipkg brand scripts determined which packages were now in the global zone and updated the non-global zone accordingly.
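
If you’ve forgotten what that dance looked like, it was roughly this, with a made-up zone name:

# zoneadm -z myzone detach
# zoneadm -z myzone attach -u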

In addition, in OpenSolaris the packaging system itself didn’t have any way of ensuring that every publisher in the global zone was also available in the non-global zone, making updates difficult when switching publishers.

Zones in Solaris 11

In Solaris 11, zones are now first-class citizens of the packaging system. Each zone is installed as a linked image, connected to the parent image, which is the global zone.

During packaging operations in the global zone, IPS recurses into any non-global zones to ensure that packages which need to be kept in sync between the global and non-global zones are kept in sync.
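
You can see the linked images from the global zone with pkg(1)’s linked-image subcommands – for example (quoting from memory, so check the man page):

$ pkg list-linked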

For this to happen, it’s important for the zone to have access to all of the IPS repositories that are available from the global zone.

This is problematic for a few reasons:

  • the zone might not be on the same subnet as the global zone
  • the global-zone administrator might not want to distribute SSL keys/certs for the repos to all zone administrators
  • the global zone might change its publisher configuration, requiring a corresponding publisher-configuration change in every non-global zone

The System Repository

The system repository and its accompanying zones-proxy services were our solution to the list of problems above.

The SMF services responsible are:

  • svc:/application/pkg/system-repository:default
  • svc:/application/pkg/zones-proxyd:default
  • svc:/application/pkg/zones-proxy-client:default

The first two services run in the global zone; the last runs in the non-global zones.

With these services, the system repository shares publisher configuration with all non-global zones on the system, and also acts as a conduit to the publishers configured in the global zone. Inside the non-global zone, these proxied global-zone publishers are called system publishers.

When performing packaging operations inside a zone that accesses those publishers, Solaris proxies the access through the system repository. While proxying, the system repository also caches any file content that was downloaded, so if lots of zones download the same packaged content, it’s handled efficiently.

Implementation

If you don’t care about how all this works behind the scenes, then you can stop reading now.

There are three parts to making all of the above work, apart from the initial linked-image functionality that Ed worked on, which was fundamental to all of the system repository work:

  • IPS client/repository support
  • Zones proxy
  • System repository

IPS client/repository support

Brock managed the heavy lifting here. This work involved:

  • defining an interchange format that IPS could use to pass publisher configuration between the global and non-global zones
  • refreshing the system repository service on every parent image publisher change
  • allowing local publisher configuration to merge with system publisher configuration
  • ensuring that system-provided publishers could not have their order changed
  • allowing an image to be created that has no publishers
  • toggling use of the system publisher

Zones proxy

The zones proxy client, when started in the non-global zone, creates a socket that listens on an inet port on 127.0.0.1. It passes the file descriptor for this socket to the zones proxy daemon via a door call.

The zones proxy daemon then listens for connections on that file descriptor. When it receives a connection, it proxies the connection on to the system repository.

This allows the zone to access the system repository without any additional networking configuration (which I think is pretty neat – nicely done, Krister!).

System repository

The system repository itself consists of two components:

  • A Python program, /usr/lib/pkg.sysrepo
  • A custom Apache 2.2 instance

Brock initially prototyped some httpd.conf configurations, and I worked on the code to generate them automatically and to produce the response the system repository uses to inform zones of the configured publishers; I also worked out how to proxy access to file-based publishers in the global zone, which was an interesting problem to solve.

When you start the system-repository service in the global zone, pkg.sysrepo(1) determines the enabled, configured publishers, then creates a response file that is served to non-global zones wanting to discover the publishers configured in the global zone. It then uses a Mako template, /etc/pkg/sysrepo/sysrepo_httpd.conf.mako, to generate an Apache configuration file.

The configuration file describes a basic caching proxy, providing limited access to the URLs of each publisher, as well as URL rewrites to serve any file-based repositories. It uses the SSL keys and certificates from the global zone, and proxies access to those publishers from the non-global zone over HTTP. (Remember, data served by the system repository between the global zone and the non-global zone travels over the zones proxy socket, so HTTP is fine here: access from the proxy to the publisher still goes over HTTPS.)

The system repository service then starts an Apache instance, along with a daemon to keep the proxy cache down to its configured maximum size. More detail on the options available to tune the system repository is in the pkg.sysrepo(1) man page.
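
The service regenerates its configuration when publishers change, but if you ever need to poke it by hand, restarting it through SMF does the trick:

$ svcadm restart svc:/application/pkg/system-repository:default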

Result?

The practical upshot of all this is that all zones can access all publishers configured in the global zone, and if that configuration changes, the zones’ publishers automatically change too. Of course, non-global zones can add their own publishers, but they aren’t allowed to change the order of, or disable, any system publishers.

Here’s what the pkg publisher output looks like in a non-global zone:

root@puroto:~# pkg publisher
PUBLISHER                             TYPE     STATUS   URI
solaris                  (non-sticky, syspub) origin   online   proxy://http://pkg.oracle.com/solaris11/release/
mypublisher              (syspub)     origin   online   http://localhost:1008/mypublisher/89227627f3c003d11b1e4c0b5356a965ef7c9712/
test                     (syspub)     origin   online   http://localhost:1008/test/eec48b7c8b107bb3ec9b9cf0f119eb3d90b5303e/

and here’s the system repository running in the global zone:

$ ps -fu pkg5srv | grep httpd
 pkg5srv   206  2334   0 12:02:02 ?           0:00 /usr/apache2/2.2/bin/64/httpd.worker -f /system/volatile/pkg/sysrepo/sysrepo_ht
 pkg5srv   204  2334   0 12:02:02 ?           0:00 /usr/apache2/2.2/bin/64/httpd.worker -f /system/volatile/pkg/sysrepo/sysrepo_ht
 pkg5srv   205  2334   0 12:02:02 ?           0:00 /usr/apache2/2.2/bin/64/httpd.worker -f /system/volatile/pkg/sysrepo/sysrepo_ht
 pkg5srv   939  2334   0 12:46:32 ?           0:00 /usr/apache2/2.2/bin/64/httpd.worker -f /system/volatile/pkg/sysrepo/sysrepo_ht

Personally, I’ve found this capability to be incredibly useful. I work from home, and have a system with an internet-facing non-global zone, and a global zone accessing our corporate VPN. My non-global zone is able to securely access new packages when it needs to (and I get to test my own code at the same time!)

Performing a full pkg update from the global zone ensures that all zones are kept in sync, updating them automatically (though, as mentioned in the Zones administration guide, pkg update <list of packages> will simply update the global zone, and ensure that during that update only the packages that cross the kernel/userland boundary are updated in each zone).

Working on zones and the system repository was a lot of fun – hope you find it useful.