Reflections on the (first annual) OpenDev Conference, SFO

Earlier this week, I attended the OpenDev conference in San Francisco, CA.

The conference was focused on the emerging “edge computing” use cases for the cloud. This is an area of particular interest to me, not just because of its obvious applicability to my ‘day job’ at Verizon, but also because it opens up an interesting new set of opportunities for distributed computing applications.

The highlight of the show was a pair of keynotes by M. Satyanarayanan of CMU. Both sessions were videotaped, and I hope the videos will be made available soon.

His team is working on some really cool stuff, and he showed off some of their work. The project I found most fascinating, and the one that most completely illustrates the value of edge computing, was an augmented reality application for playing table tennis (which they call ping pong, and I know that annoys a lot of people :)).

It was also great to hear a user perspective, presented by Andrew Mitry of Walmart. With 11,000 stores and an enormous number of employees (2 million?), their edge computing use case truly represents the scale at which these systems will have to operate, and the benefits they can bring to the enterprise.

The conference sessions were very interesting, and some of my key takeaways were:

  • Edge Computing means different things to different people, because the term ‘Edge’ means different things to different applications. In some cases the edge device may be in a data center, in others in your house, and in others on top of a lamp post at the end of your street.
  • A common API for orchestrating applications across the entirety of the cloud is very important, but different technologies may be better suited to each location in the cloud. There was a lot of discussion of the value (or lack thereof) of having OpenStack at the edge, and whether it made sense for edge devices to be orchestrated by OpenStack.
  • An enormous amount of time was spent debating whether or not OpenStack could be made to fit on a system with limited resources, and I found this discussion rather tiring. After all, OpenStack runs fine on a little Raspberry Pi, and for a deployment where there will be relatively few OpenStack operations (instance, volume, and security group creation, update, and deletion), the limited resources at the edge should be more than sufficient.
  • There are several use-cases for edge computing; NFV/VNF is not the only one, and while NFV workloads may be the early movers into this space, they may be unrepresentative of the larger market opportunity presented by the edge.

There is a lot of activity going on in the edge computing space, and many of the things we’re doing at Verizon fall into that category. Several sessions showcased some of the things that we have been doing, and AT&T had a couple of sessions describing their initiatives in this space as well.

There was a very interesting discussion of the edge computing use-cases and the etherpad for that session can be found here.

Some others who attended the session also posted summaries on their blogs. This one from Chris Dent provides a good summary.

A conclusion/wrap-up session identified some clear follow-up activities. The etherpad for that session can be found here.

Troubleshooting OpenStack gate issues with logstash

From time to time, I have to figure out why the Trove CI failed some job. By “from time to time”, I really mean “constantly”, “all the time”, and “every day”.

Very often the issue is some broken change that someone pushed up. The easy ones are pep8 or pylint failures; slightly harder are the py27/py35 failures. The really hard ones are failures in the Trove scenario tests.

Very often the failures are transient and a recheck fixes them (which is annoying in itself), but sometimes the failure is repeatable.

In the past week, I’ve had to deal with one such issue; I first realized that it was a repeated failure after about a dozen rechecks of various changes.

I realized that the failure had a telltale signature that looked like this:

Test Replication Instance Multi-Promote functionality.
Test promoting a replica to replica source (master).        SKIP: Failure in <function wait_for_delete_non_affinity_master at 0x7f1aafd53320>
Verify data is still on new master.                         SKIP: Failure in <function wait_for_delete_non_affinity_master at 0x7f1aafd53320>
Add data to new master to verify replication.               SKIP: Failure in <function wait_for_delete_non_affinity_master at 0x7f1aafd53320>
Verify data exists on new master.                           SKIP: Failure in <function wait_for_delete_non_affinity_master at 0x7f1aafd53320>

The important part of the message (I realized later) was the part that read:

Failure in <function wait_for_delete_non_affinity_master ...
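The hexadecimal address in that message changes from run to run, so only the stable prefix is usable as a search signature. Here is a small sketch (plain Python, with a helper name of my own invention, not part of any OpenStack tool) of extracting it:

```python
import re


def extract_signature(line):
    """Strip the per-run '0x...' address out of a failure message,
    keeping only the stable part suitable for a logstash search."""
    m = re.search(r"Failure in <function (\w+) at 0x[0-9a-f]+>", line)
    if m is None:
        return None
    return "Failure in <function %s" % m.group(1)


line = ("Verify data is still on new master.  SKIP: Failure in "
        "<function wait_for_delete_non_affinity_master at 0x7f1aafd53320>")
print(extract_signature(line))
```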

Looking a bit above this, I found the test that had in fact failed:

Wait for the non-affinity master to delete.                 FAIL

One important thing in this kind of debugging is to figure out when the failure really started to happen, and that’s one of the places where logstash comes in really handy.

For every single CI job run by the OpenStack integrated gate, the result artifacts are parsed and some of them are indexed in an elasticsearch database.

It is now trivial to take the string that I felt was the ‘signature’ and search for it in logstash. Within seconds I could tell that this error began to occur on 4/11.
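The actual search happens in the logstash/Kibana web UI, but the idea can be sketched with a toy stand-in: given exported (timestamp, message) hits, find the earliest occurrence of the signature. The hits below are made up for illustration:

```python
from datetime import datetime


def first_occurrence(hits, signature):
    """Return the earliest timestamp among (timestamp, message) hits
    whose message contains the signature string, or None."""
    matches = [datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S")
               for ts, msg in hits if signature in msg]
    return min(matches) if matches else None


# Made-up sample of indexed hits.
hits = [
    ("2017-04-13T10:22:00",
     "Failure in <function wait_for_delete_non_affinity_master"),
    ("2017-04-11T08:05:00",
     "Failure in <function wait_for_delete_non_affinity_master"),
    ("2017-04-10T23:59:00", "some unrelated failure"),
]
print(first_occurrence(hits, "wait_for_delete_non_affinity_master"))
```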

This, by itself was not sufficient to figure out what the problem was, but once Matt Riedemann identified a probable cause, I was able to confirm that the problem started occurring shortly after that change merged.

Logstash is a really really powerful tool, give it a shot, you’ll find it very useful.

Reflections on the first #OpenStack PTG (Pike PTG, Atlanta)

A very long time ago, and on a planet very unlike the earth we are on now, Thierry sent this email [1] proposing that after Barcelona, we split the OpenStack Summit into what have now come to be known as the Project Teams Gathering (PTG) and the Forum. The email thread that resulted from [1] was an active one, with 125 responses; along the way, Thierry also posted [2], a summary of the issues and concerns raised.

I had my reservations about the idea, and now, after returning from the first of these, have had some time to reflect on the result.

On the whole, the event was very well done; I believe that everyone who attended the feedback session had positive things to say about it, and my congratulations go to Erin Disney and the rest of the team at the Foundation. Attendance was a solid 500 to 600 people (I don’t know the exact number), and Thierry must be psychic because he predicted almost exactly that in February [3].

I did not realize that the Foundation had gone as far as to get Starbucks to customize a blend of coffee for us, and to get the Sheraton to distribute it in our rooms (image courtesy of Masayuki Igawa, @masayukig).

The format gave attendees the opportunity to get a significant amount of work done both within their own project teams, as well as with other project teams, without the interruptions and distractions of the summit.

I particularly liked the fact that I could attend two days of cross-project sessions and then two and a half days of sessions with my own project team. Giving projects two or three room-days instead of four or five room-hours dramatically improved the amount of time that teams could spend focused on their own discussions, as well as on cross-project discussions.

Personally, I think the PTG was a success, and it seems to have delivered most if not all of the things that it set out to do. Some things outside of our control, certainly outside the control of the Foundation, have cast a small shadow on the proceedings, and we need to seriously consider the consequences of where the next Summits and PTGs are held. The location has implications for many attendees, and I think we should seriously consider having remote participation at future events.

From my recollection of the feedback session (unfortunately I don’t have a link to the etherpad, if someone has it, please post a comment with it) everyone had good things to say about the event as a whole. The consensus was that the food was good but cold sandwiches get boring after day 3, the air handlers were noisy, the rooms were too cold (or hot), the chairs were uncomfortable, and there was no soda. That feedback is consistent with organizing an event for 500 people in a hotel or convention facility anywhere in the world. And if that’s all that people could put down in the “needs improvement” section, the event was a huge success.

I think the award for the best picture at the PTG (thanks to Thierry for the tweet) goes to the Designate team [4]. I should’ve thought to get a Trove team photo while we were there!

[1] http://markmail.org/thread/v6h3qzs7rb35h6fo

[2] http://markmail.org/message/slzcvunoxccse5k4

[3] http://markmail.org/message/bultywgyxued5khl

[4] https://twitter.com/tcarrez/status/835149239571316736/photo/1

Automating OpenStack’s gerrit commands with a CLI

Every OpenStack developer has to interact with the gerrit code review system. Reviewers and core reviewers have to do this even more, and PTLs do a lot of it.

The web-based interface is not conducive to many of the more common things that one has to do while managing a project, so early on I started using the gerrit query CLI.

Along the way, I started writing a simple CLI that I could use to automate more things, and recently a few people asked about these tools and whether I’d share them.

I’m not claiming that this is unique, or that this hasn’t been done before; it evolved slowly, and there may be a better set of tools out there that does all of this (and more) that I don’t know about. If you have similar tools, please do share (comment below).

So, I’ve cleaned up these tools a bit (removing things like my private key, username, and password) and made them available here.

Full disclosure: they are rough around the edges, and you could cause yourself some grief if you aren’t quite sure what you are doing.

Here’s a quick introduction

Installation

It should be nothing more than cloning the repository git@github.com/amrith/gerrit-cli and running the install command. Note that I use Python 2.7 as my default Python on Ubuntu 16.04. If you use Python 3.x, your mileage may vary.

Simple commands

The simplest command is ‘ls’, which lists reviews:

gerrit-cli ls owner:self

As you can see, the search here is a standard gerrit query search.
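Under the covers, the CLI ultimately just shells out to gerrit’s ssh query interface. Here is a hedged sketch (the helper name is mine, not necessarily the tool’s) of how that command line gets assembled, mirroring the --dry-run output shown later in this post:

```python
def build_query_cmd(host, port, terms):
    """Assemble the ssh command line for a gerrit query, in the same
    shape a dry run of the CLI prints it."""
    return ["ssh", host, "-p", str(port),
            "gerrit", "query", "--format=JSON",
            "--current-patch-set", "--patch-sets",
            "--all-approvals"] + list(terms)


cmd = build_query_cmd("review.openstack.org", 29418, ["owner:self"])
print(" ".join(cmd))
```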

You don’t have to type complex queries every time; you can store and reuse queries. A very simple configuration file is used for this (a sample configuration file is provided and gets installed by default).

amrith@amrith-work:~$ cat .gerrit-cli/gerrit-cli.json
{
    # global options
    "host": "review.openstack.org",
    "port": 29418,

    # "dry-run": true,

    # user defined queries
    "queries": {
        # each query is necessarily a list, even if it is a single string
        "trove-filter": ["(project:openstack/trove-specs OR project:openstack/trove OR project:openstack/trove-dashboard OR project:openstack/python-troveclient OR project:openstack/trove-integration)"],

        # the simple filter uses the trove-filter and appends status:open and is therefore a list

        "simple": ["trove-filter", "status:open"],

        "review-list": ["trove-filter", "status:open", "NOT label:Code-Review>=-2,self"],

        "commitids": ["simple"],

        "older-than-two-weeks": ["simple", "age:2w"]
    },

    # user defined results
    "results": {
        # each result is necessarily a list, even if it is a single column
        "default": ["number:r", "project:l", "owner:l", "subject:l:80", "state", "age:r"],
        "simple": ["number:r", "project:l", "owner:l", "subject:l:80", "state", "age:r"],
        "commitids": [ "number:r", "subject:l:60", "owner:l", "commitid:l", "patchset:r" ],
        "review-list": [ "number:r", "project:l", "branch:c", "subject:l:80", "owner:l", "state", "age:r" ]
    }
}

The file is simple JSON, and you can comment lines just as you would in Python (#…).
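The tool presumably strips those comment lines before parsing; a minimal sketch of such a loader (my own illustration, not necessarily the CLI’s actual code):

```python
import json


def load_config(text):
    """Drop full-line '#' comments, then parse the rest as JSON."""
    lines = [line for line in text.splitlines()
             if not line.lstrip().startswith("#")]
    return json.loads("\n".join(lines))


cfg = load_config("""{
    # global options
    "host": "review.openstack.org",
    "port": 29418
}""")
print(cfg["host"], cfg["port"])
```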

Don’t do anything, just --dry-run

The best way to see what’s going on is to use the --dry-run option (or, to be sure, uncomment the line in your configuration file).

amrith@amrith-work:~$ gerrit-cli --dry-run ls owner:self
ssh review.openstack.org -p 29418 gerrit query --format=JSON --current-patch-set --patch-sets --all-approvals owner:self
+--------+---------+-------+---------+-------+-----+
| Number | Project | Owner | Subject | State | Age |
+--------+---------+-------+---------+-------+-----+
[...]
+--------+---------+-------+---------+-------+-----+

So the owner:self query makes a gerrit query and formats and displays the output as shown above.

So, what columns are displayed? The configuration contains a section called “results” and a default result is defined there.

"default": ["number:r", "project:l", "owner:l", "subject:l:80", "state", "age:r"],

You can override the default and cause a different set of columns to be shown. If a default is not found, the code has a hardcoded default as well.
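The column specs like number:r and subject:l:80 appear to encode a field name, an alignment, and an optional width. A sketch of how such a spec might be parsed (my reading of the format, not the tool’s actual code):

```python
def parse_column(spec):
    """Split a result-column spec such as 'subject:l:80' into
    (field, alignment, width); alignment defaults to left and
    width to None."""
    parts = spec.split(":")
    field = parts[0]
    align = parts[1] if len(parts) > 1 else "l"
    width = int(parts[2]) if len(parts) > 2 else None
    return field, align, width


print(parse_column("subject:l:80"))
print(parse_column("number:r"))
print(parse_column("state"))
```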

Similarly, you could run the query:

amrith@amrith-work:~$ gerrit-cli --dry-run ls
ssh review.openstack.org -p 29418 gerrit query --format=JSON --current-patch-set --patch-sets --all-approvals owner:self status:open
+--------+---------+-------+---------+-------+-----+
| Number | Project | Owner | Subject | State | Age |
+--------+---------+-------+---------+-------+-----+
+--------+---------+-------+---------+-------+-----+

and a default query will be generated for you; that query is owner:self and status:open.

You can nest these definitions as shown in the default configuration.

amrith@amrith-work:~$ gerrit-cli --dry-run ls commitids
ssh review.openstack.org -p 29418 gerrit query --format=JSON --current-patch-set --patch-sets --all-approvals (project:openstack/trove-specs OR project:openstack/trove OR project:openstack/trove-dashboard OR project:openstack/python-troveclient OR project:openstack/trove-integration) status:open
+--------+---------+-------+---------+-------+-----+
| Number | Project | Owner | Subject | State | Age |
+--------+---------+-------+---------+-------+-----+
+--------+---------+-------+---------+-------+-----+

The query “commitids” is expanded as follows:

commitids    -> simple
simple       -> trove-filter, status:open
trove-filter -> (...)
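That nesting can be resolved with a simple recursive expansion; here is a sketch (hypothetical helper, with the trove-filter abbreviated from the sample configuration):

```python
def expand(name, queries):
    """Recursively expand a named query into its final list of
    gerrit search terms."""
    terms = []
    for item in queries[name]:
        if item in queries:
            # The term names another stored query; expand it in place.
            terms.extend(expand(item, queries))
        else:
            terms.append(item)
    return terms


queries = {
    "trove-filter": [
        "(project:openstack/trove OR project:openstack/trove-dashboard)"],
    "simple": ["trove-filter", "status:open"],
    "commitids": ["simple"],
}
print(expand("commitids", queries))
```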

What else can I do?

You can do a lot more than just list reviews …

amrith@amrith-work:~$ gerrit-cli --help
usage: gerrit [-h] [--host HOST] [--port PORT] [--dry-run]
              [--config-file CONFIG_FILE] [-v]
              {ls,show,update,abandon,restore,recheck} ...

A simple gerrit command line interface

positional arguments:
  {ls,show,update,abandon,restore,recheck}
    ls                  list reviews
    show                show review(s)
    update              update review(s)
    abandon             abandon review(s)
    restore             restore review(s)
    recheck             recheck review(s)

optional arguments:
  -h, --help            show this help message and exit
  --host HOST           The gerrit host. Default: review.openstack.org
  --port PORT           The gerrit port. Default: 29418
  --dry-run             Whether or not to actually execute commands that
                        modify a review.
  --config-file CONFIG_FILE
                        The path to the gerrit-cli configuration file to use
                        for this session. (Default: ~/.gerrit-cli/gerrit-
                        cli.json
  -v, --verbose         Provide additional (verbose) debug output.

Other things that I do quite often (and like to automate) are update, abandon, restore and recheck.

A word of caution: when you aren’t sure what a command will do, use --dry-run. Otherwise, you could end up in a world of hurt.

Like when you accidentally abandon 100 reviews 🙂

And even if you know what your query should do, remember I’ve hidden some choice bugs in the code. You may hit those too.

Enjoy!

I’ll update the readme with more information when I get some time.

Effective OpenStack contribution: Seven things to avoid at all cost

There are numerous blogs and resources for the new and aspiring OpenStack contributor, providing tips, listing what to do. Here are seven things to avoid if you want to be an effective OpenStack contributor.

I wrote one of these.

There have been presentations at summits that share other useful newbie tips as well; here is one.

Project repositories often include a CONTRIBUTING.rst file that shares information for newcomers. Here is the file for the Trove project.

Finally, many of these resources include a pointer to the OpenStack Developer’s Guide.

Over the past three years, I have seen several newbie mistakes repeated over and over again, and in thinking about some recent incidents, I think the community has not done a good job of documenting these “don’t do’s”.

So here is a start; here are seven things you shouldn’t do, if you want to be an effective OpenStack contributor.


1. Don’t submit empty commit messages

The commit message is a useful part of the commit; it serves to inform reviewers about what the change is, and how your proposed fix addresses the problem. In general (with the notable exception of procedural commits for things like releases or infrastructure), the commit message should not be empty. The words “Trivial Fix” do not suffice.

OpenStack documents best practices for commit messages. Make sure your commit message provides a succinct description of the problem, describes how you propose to fix it, and includes a reference (via the Closes-Bug, Partial-Bug, or Related-Bug tags) to the Launchpad entry for the issue you are fixing.
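As an illustration (the bug number and change described here are entirely made up), a commit message following those guidelines might look like this:

```
Wait for master delete before promoting a replica

The promote-to-replica-source call could fail if the old master
had not finished deleting. Wait for the delete to complete before
re-pointing the replicas at the new master.

Closes-Bug: #1234567
```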


2. Don’t expect that reviews are automatic

In OpenStack, reviewing changes is a community activity. If you propose changes, they get merged because others in the community contribute their time and effort in reviewing your changes. This wouldn’t work unless everyone participates in the review process.

Just because you submitted some changes, don’t expect others to feel motivated or obligated to review them. In many projects, review bandwidth is at a premium, so you will have a better chance of getting your change reviewed and approved if you reciprocate and review other people’s changes.


3. Don’t leave empty reviews

When you review someone’s code, merely adding a +1 serves no useful purpose. At the very least, indicate what you did with the change. Equally useful is to say what you did not do.

For example, you could indicate that you only reviewed the code and did not actually test it out. Or you could go further and download and test the patch set and indicate in your review comment that you tested the change and found it to work. On occasion, such as when I review a change for the first time, I will indicate that I have reviewed the changes but not the tests.

Feel free to ask questions about the change if you don’t follow what is being done. Also, feel free to suggest alternate implementations if you feel that the proposed implementation is not the best one for some reason(s).

Don’t feel shy about marking a change with a -1 if you feel that it is not ready to merge for some reason.

A drive-by +1 is a generally unhelpful activity, and if you persist at doing that, others in the community will tend to discount your reviews anyway.


4. Don’t game the Stackalytics system

By far, the most egregious violation that I’ve seen is when people blatantly try to game the Stackalytics system. Stackalytics is a tool that tracks individual and company participation in OpenStack.

Here, for example, is the Stackalytics page for the Trove project in the current release:

Reviews: http://stackalytics.com/?module=trove-group

Commits: http://stackalytics.com/?module=trove-group&metric=commits

It allows you to see many metrics in a graphical way, and allows you to slice and dice the data in a number of interesting ways.

New contributors, bubbling with enthusiasm, often fall into the trap of trying to game the system and rack up reviews or commits. This can end very badly for you if you go down this route. For example, one very enthusiastic person recently showed up with a change that got blasted out to about 150 projects, attempting to add a CONTRIBUTING.rst file to all of them. What ensued is documented in this mailing list thread.

A few of the changes were merged before they were reverted, the vast majority were abandoned.

Changes like this serve no real useful purpose. They also consume an inordinate amount of resources in the CI system. I computed that the little shenanigan described above generated approximately 1050 CI jobs and consumed about 190 hours of time on the CI system.

I admit that numbers are important, and they are a good indication of participation. But quality is a much more important metric, because quality is an indicator of contribution. I firmly believe that participation is about showing up, contribution is about what you do once you are here, and contribution is a far more important thing to aim for than participation.


5. Don’t ignore review comments

When you’ve submitted a change and people review it and provide comments, don’t ignore them. If you are serious about a change, stay with it until it gets merged. Respond to comments in a timely manner, even if only to say that you will come back with a new patch set in some time.

If you don’t, remember that review bandwidth is a scarce resource and in the future your changes may get scant attention from reviewers. Others who review your changes are taking time out of their schedules to participate in the community. At the very least you should recognize and respect that investment on their part and reciprocate with timely responses.


6. Don’t be shy

And above all, if you aren’t sure how to proceed, don’t be shy. Post a question on the mailing list if you aren’t sure what to do about something. If that’s too public for you (and that’s perfectly alright), ask the question on the IRC channel for the project in question. If that is too public, find someone who is active on the project (start with the PTL) and send that person an email.

An important aspect of the PTL role is fielding those questions, and all of us PTLs receive several of them each month. Not sure whom to ask? Then toss the question out on IRC at #openstack or #openstack-dev and you should receive an answer before long.


7. Don’t be an IRC ghost

An important thing to remember about IRC is that it is an asynchronous medium, so don’t expect answers in real time. The OpenStack community is highly distributed, but it is most active during daylight hours, Monday to Friday, in US time zones. If you pop up on IRC, ask a question, and then disappear, you may not get your answer. If you can’t stick around for a long time on IRC, post your question to the mailing list.

Better still, there are many ways in which you can connect to IRC and leave the connection up (so you can read the scrollback), or find some other mechanism to review the scrollback (like eavesdrop.openstack.org) to see if your question was answered.


If you have your own pet peeve, please share it in the comments section. I hope this will become a useful resource for aspiring OpenStack contributors.

Addressing a common misconception regarding OpenStack Trove security

Since my first OpenStack Summit in Atlanta (mid 2014), I have been to a number of OpenStack-related events, meetups, and summits. And at every one of these events, as well as numerous customer and prospect meetings, I’ve been asked some variant of the question:

Isn’t Trove insecure because the guestagent has RabbitMQ credentials?

A bug was entered in 2015 with the ominous (and factually inaccurate) description that reads “Guestagent config leaks rabbit password”.

And while I’ve tried to explain to people that this is not at all the case, this misconception has persisted.

At the Summit in Barcelona, I was asked yet again about this and I realized that obviously, whatever we in the Trove team had been doing to communicate the reality was insufficient. So, in preparation for the upcoming Summit in Boston, I’m writing this post as a handy resource.

What is the problem?

Shown here is a simplified representation of a Trove system with a single guest database instance. The control plane components (Trove API, Trove Task Manager, and Trove Conductor) and the Guest Agent communicate via oslo.messaging which is typically implemented with some messaging transport like RabbitMQ.

To connect to the underlying transport, each of these four components needs to store credentials; for RabbitMQ this is a username and password.

The contention is that if a guest instance is somehow compromised (and there are many ways to do this) and a bad actor gains access to the RabbitMQ credentials, then the OpenStack deployment is compromised.

Why is this not really a problem?

Here are some reasons this is not really an issue on a properly configured production system.

  1. Nothing requires that Trove use the same RabbitMQ servers as the rest of OpenStack. So at the very least, the compromise can be limited to the RabbitMQ servers used by Trove.
  2. The guest instance is not intended to be a general-purpose instance that a user has access to; in the intended deployment, the only connectivity to the guest instance would be to the database ports for queries. These are configurable with each database (datastore) and enforced by Neutron. Shell access (port 22, ssh) is a no-no. No deployer would use images and configurations that allowed this kind of access.
  3. On the guest instance, other database specific best practices are used to prevent shell escapes and other exploits that will give a user access to the RabbitMQ credentials.
  4. Guest instances can be spawned by Trove using service credentials, or credentials for a shadow tenant to prevent an end user from directly accessing the underlying Nova instance. Similarly Cinder volumes can be provisioned with a different tenant to prevent an end user from directly accessing the underlying volume.

All of this notwithstanding, the urban legend persisted that Trove was a security risk. The reasoning invariably involved a system configured by devstack, with a single RabbitMQ server, open access to port 22 on the guest, run in the same tenant as the requestor of the database.

Yet, one can safely say that no one in their right mind would operate OpenStack as configured by devstack in production. And certainly, with Trove, one would not use the development images whose elements are part of the source tree in a production deployment.

Proposed security-related improvements in Ocata

In the Ocata release, one additional set of changes has been made to further secure the system. All RPC calls on the oslo.messaging bus are completely encrypted. Furthermore, different conversations are encrypted using unique encryption keys.

The messaging traffic on oslo.messaging is solely for oslo_messaging.rpc, the OpenStack Remote Procedure Call mechanism. The API service makes calls into the Task Manager, the Task Manager makes calls into the Guest Agent, and the Guest Agent makes calls into the Conductor.

The picture above shows these different conversations, and the encryption keys used on each. When the API service makes an RPC call to the Task Manager, all parameters to the call are encrypted using K1 which is stored securely on the control plane.

Unique encryption keys are created for each guest instance, and these keys are used for all communication. When the Task Manager wishes to make a call to Guest Agent 1, it uses the instance specific key K2, and when it wants to make a call to Guest Agent 2, it uses the instance specific key K3. When the guest agents want to make calls to the Conductor, the traffic is encrypted using the instance specific keys and the conductor decrypts the parameters using those instance specific keys.
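To make that property concrete, here is a deliberately toy sketch (standard library only; the real implementation uses a proper symmetric cipher, not this XOR construction, and these key values are made up) showing why holding one instance key does not expose another instance’s traffic:

```python
import hashlib


def keystream(key, length):
    """Expand a per-instance key into a keystream.  Toy construction
    for illustration only: SHA-256 in counter mode."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def encrypt(key, payload):
    # XOR the payload against the key-derived keystream.
    return bytes(a ^ b for a, b in zip(payload, keystream(key, len(payload))))


decrypt = encrypt  # XOR stream cipher: the same operation both ways

k2 = b"key-for-guest-instance-1"  # held by Task Manager and Guest Agent 1
k3 = b"key-for-guest-instance-2"  # held by Task Manager and Guest Agent 2

wire = encrypt(k2, b"create snapshot")          # Task Manager -> Guest Agent 1
print(decrypt(k2, wire))                        # Guest Agent 1 recovers the call
print(decrypt(k3, wire) == b"create snapshot")  # a holder of K3 alone cannot
```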

In a well-configured production deployment, one that takes steps to secure the system, if a bad actor were to compromise a guest instance (say Guest Agent 1) and get access to K2 and the RabbitMQ credentials, he could access RabbitMQ but would not be able to do anything to impact either another guest instance (he wouldn’t have K3) or the Task Manager (he wouldn’t have K1).

Code that implements this capability is currently in upstream review.


This blog post resulted in a brief Twitter exchange with Adam Young (@admiyoung).

Unfortunately, a single RabbitMQ user for Trove isn’t the answer. Should a guest get compromised, those credentials are sufficient to post messages to RabbitMQ and cause some amount of damage.

One would need per-guest-instance credentials to avoid this, or one of the many other solutions (like shadow tenants, and so on).

Enabling hacking extensions: The right way

Of late, I wake up every morning raring to go and work on the next cool thing in Trove, only to see that overnight some well-meaning person has contributed a change that looks something like this:

String interpolation should be delayed to be handled by the logging code, rather than being done at the point of the logging call.
Ref: http://docs.openstack.org/developer/oslo.i18n/guidelines.html#log-translation For example:
# WRONG
LOG.info(_LI('some message: variable=%s') % variable)
# RIGHT
LOG.info(_LI('some message: variable=%s'), variable)
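The difference is easy to demonstrate with the standard library’s logging module (the same mechanism oslo.log builds on): with eager interpolation the string is rendered even when the message is filtered out; with lazy interpolation it never is.

```python
import logging


class Costly(object):
    """Stand-in for an expensive-to-render object; counts renderings."""
    renders = 0

    def __str__(self):
        Costly.renders += 1
        return "value"


logging.basicConfig(level=logging.WARNING)  # INFO messages are filtered out
log = logging.getLogger(__name__)
obj = Costly()

log.info("eager: %s" % obj)  # WRONG: renders obj even though nothing is logged
log.info("lazy: %s", obj)    # RIGHT: formatting deferred, never happens here
print(Costly.renders)        # 1 -- only the eager call paid the rendering cost
```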

And the code submitted fixes a small number (let’s say 5) of the places where strings sent to logging are rendered.

As I said at the TC-Board meeting in Barcelona, these well-meaning people are submitting what on the face of it appear to be valid corrections to the code. Yet I submit to you that these changes represent a scourge that we should stamp out.

I know for a fact that there are (currently) 751 occurrences of this particular style error in Trove. This is the hacking extension H904, and when I enable it in Trove, I get this:

$ tox -e pep8 | grep H904 | wc -l
751

That’s the catch: Trove does not enable this hacking extension. A quick look indicates that only Neutron does.

Why are these well-meaning changes a scourge? Here’s why…

  • Fixing a small fraction of these errors, without preventing them from recurring, does not materially improve the project.
  • Each of these changes consumes considerable CI resources to verify and approve.
  • Each of these changes takes time for someone to review, time that could be better spent fixing these problems properly.

So, I submit to you that if you want to submit a patch to fix one of these hacking issues, here is the right way. Of course, I’m opinionated, so I’m going to reference one of my own changes as an example!

  1. If your project does not have hacking extensions, this commit shows you what you have to do to enable that. You may have to bump test-requirements.txt and update the version of hacking that you use in order to use the ‘enable-extensions’ option.
  2. Enable the hacking rule or extension for the particular style issue at hand; let’s illustrate with H203. H203 ensures that we use assertIsNone() and not assertEqual(None, …).
  3. Run the pep8 test against the project and find and correct all places where the failure occurs. Typically this is accomplished by just running ‘tox -e pep8’.
  4. Test that the code does in fact work as expected; correcting style issues can introduce functional errors, so make sure that the unit tests pass too. Typically this is accomplished by running ‘tox -e py27,py34’.
  5. Actually exercise the system; launch a system with devstack and the project enabled, and actually exercise the system. In the case of Trove, actually build a guest and launch a database or two.
  6. Then submit your change including the change to tox.ini that enables the hacking rule for review.
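For step 2, the change is typically a one-line addition to the [flake8] section of tox.ini; a hypothetical fragment (check the option spelling against your flake8 and hacking versions):

```ini
# tox.ini (fragment)
[flake8]
# off-by-default hacking checks must be enabled explicitly
enable-extensions = H203,H904
```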

Well, that’s a lot of work! Sure, you really have to work for your Stackalytics credit, right? I’m sure the load on the CI system will show that this is worthwhile.

It is better to do things this way in the long run. With the hacking rule enabled, future commits will also comply with the rule (they will fail pep8 if they don’t). And that will put an end to the cottage industry that has sprung up around finding these kinds of errors and fixing them one at a time.

In conclusion, I urge reviewers in all projects to summarily reject style changes that don’t also enable a hacking rule. Approving them is the wrong thing to do. Require the contributor to enable the hacking rule and fix the problem the right way. That’s what good code review is about.