Tempest Test Removal Procedure
==============================

Historically, tempest was the only way of doing functional testing and
integration testing in OpenStack. This was mostly an artifact of tempest
being the only proven pattern for doing this, not the result of a design
decision. However, as functional testing is being spun up in each individual
project, we really only want tempest to be the integration test suite it was
intended to be: testing the high-level interactions between projects through
REST API requests. In this model there are probably existing tests that aren't
the best fit for living in tempest. However, since tempest is still largely
the only gating test suite in this space, we can't carelessly rip everything
out of the tree. This document outlines the procedure which was developed to
minimize the risk of removing something of value from the tempest tree.

This procedure might seem overly conservative and slow paced, but this is by
design, to try to ensure we don't remove something that is actually providing
value. Potential duplication between test suites is not a big deal,
especially compared to the alternative of removing something which is actively
catching bugs or blocking incorrect patches from landing.

Proposing a test removal
------------------------

3 prong rule for removal
^^^^^^^^^^^^^^^^^^^^^^^^

In the proposal etherpad we'll be looking for answers to 3 questions:

#. The tests proposed for removal must have equivalent coverage in a different
   project's test suite (whether this is another gating test project, or an
   in-tree functional test suite). For API tests, preferably the other project
   will have a similar source of friction in place to prevent breaking API
   changes, so that we don't regress and let breaking API changes slip through
   the gate.
#. The test proposed for removal has a failure rate < 0.50% in the gate over
   the past release (the value and interval will likely be adjusted in the
   future).
#. There must not be an external user/consumer of tempest that depends on the
   test proposed for removal.

The answers to 1 and 2 are easy to verify. For 1, just provide a link to the
new test location. If you are linking to the tempest removal patch, please also
put a Depends-On in its commit message referencing the commit which moved the
test into another repo.

For prong 2 you can use OpenStack-Health:

Using OpenStack-Health
""""""""""""""""""""""

Go to http://status.openstack.org/openstack-health and then navigate to the
per-test page with the duration set to six months. You'll end up with a page
that graphs the success and failure rates on the bottom graph. For example,
something like `this URL`_.

.. _this URL: http://status.openstack.org/openstack-health/#/test/tempest.scenario.test_volume_boot_pattern.TestVolumeBootPatternV2.test_volume_boot_pattern?groupKey=project&resolutionKey=day&duration=P6M
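
As a rough sketch, the per-test link can be assembled from a test id by
following the shape of the example URL above. The helper name ``health_url`` is
hypothetical, and the query parameters are simply copied from that example;
this assumes the dashboard keeps the same URL layout.

```python
from urllib.parse import quote, urlencode

# Base of the per-test page, taken from the example URL in this document.
HEALTH_BASE = "http://status.openstack.org/openstack-health/#/test/"


def health_url(test_id, duration="P6M"):
    """Build a per-test OpenStack-Health link (hypothetical helper).

    The default duration "P6M" (six months) mirrors the example URL.
    """
    query = urlencode({
        "groupKey": "project",
        "resolutionKey": "day",
        "duration": duration,
    })
    # Percent-encode anything unusual in the test id (for example brackets).
    return HEALTH_BASE + quote(test_id, safe="") + "?" + query


print(health_url(
    "tempest.scenario.test_volume_boot_pattern."
    "TestVolumeBootPatternV2.test_volume_boot_pattern"
))
```

Passing the ``test_volume_boot_pattern`` id from the example reproduces the
link shown above.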

The Old Way using subunit2sql directly
""""""""""""""""""""""""""""""""""""""

SELECT * from tests where test_id like "%$test_id%";
(where $test_id is the full test_id, but truncated to the class because of
setUpClass or tearDownClass failures)

You can access the infra mysql subunit2sql db with read-only permissions using:

* hostname: logstash.openstack.org
* username: query
* password: query
* db_name: subunit2sql

For example, if you were trying to remove the test with the id:
tempest.api.compute.admin.test_flavors_negative.FlavorsAdminNegativeTestJSON.test_get_flavor_details_for_deleted_flavor
you would run the following:

#. run: "mysql -u query -p -h logstash.openstack.org subunit2sql" to connect
   to the subunit2sql db
#. run the query: MySQL [subunit2sql]> select * from tests where test_id like
   "tempest.api.compute.admin.test_flavors_negative.FlavorsAdminNegativeTestJSON%";
   which will return a table of all the tests in the class (but it will also
   catch failures in setUpClass and tearDownClass)
#. paste the output table with numbers and the mysql command you ran to
   generate it into the etherpad.
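
The truncation to the class in the steps above can be sketched in Python. The
function names are hypothetical, and the ``%s`` placeholder assumes a MySQL
driver that uses the format parameter style (PyMySQL, for instance); nothing
here is part of tempest or subunit2sql.

```python
def class_like_pattern(test_id):
    """Truncate a full test id to its class and append the SQL wildcard."""
    # Drop any trailing "[...]" attribute suffix before splitting.
    base = test_id.split("[", 1)[0]
    # Everything before the last dot is the module path plus class name.
    class_path, _, _method = base.rpartition(".")
    return class_path + "%"


def build_query(test_id):
    """Return (sql, params) for a parameterized LIKE query on the tests table."""
    sql = "SELECT * FROM tests WHERE test_id LIKE %s"
    return sql, (class_like_pattern(test_id),)


full_id = (
    "tempest.api.compute.admin.test_flavors_negative."
    "FlavorsAdminNegativeTestJSON.test_get_flavor_details_for_deleted_flavor"
)
print(build_query(full_id))
```

Matching on the class prefix rather than the full id is what lets the query
also pick up rows recorded for setUpClass and tearDownClass failures.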

Eventually a CLI will be created to make that a bit more friendly.
A dashboard is also in the works so we don't need to run the command manually.

The intent of the 2nd prong is to verify that moving the test into
project-specific testing is preventing bugs (assuming the tempest tests were
catching issues) from bubbling up a layer into tempest jobs. If we're seeing
failure rates above a certain threshold in the gate checks, that means the
functional testing isn't really effective at catching that bug (and therefore
blocking it from landing), and running the test in tempest still has value.
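
To make the threshold concrete, here is a minimal sketch of the prong 2
arithmetic, assuming you already have success and failure counts for the test
over the period in question. The function names and the sample counts are
illustrative only, and skipped runs are left out of the total, which is a
simplifying assumption.

```python
# Threshold from prong 2; the document notes this value may be adjusted.
THRESHOLD_PERCENT = 0.50


def failure_rate_percent(success, failure):
    """Failure rate as a percentage of all recorded pass/fail runs."""
    total = success + failure
    if total == 0:
        raise ValueError("no recorded runs for this test")
    return 100.0 * failure / total


def meets_prong_two(success, failure, threshold=THRESHOLD_PERCENT):
    """True when the failure rate is strictly below the threshold."""
    return failure_rate_percent(success, failure) < threshold


# Hypothetical counts: 10 failures in 10000 runs is a 0.1% failure rate.
print(failure_rate_percent(9990, 10), meets_prong_two(9990, 10))
```

The comparison is strict, matching the "< 0.50%" wording of the rule, so a
test sitting exactly at the threshold does not qualify.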

However, for the 3rd prong verification is a bit more subjective. The original
intent of this prong was mostly for refstack/defcore and also for things that
run on the stable branches. We don't want to remove any tests if that would
break our API consistency checking between releases, or remove something that
defcore/refstack depends on being in tempest. It's worth pointing out
that if a test is used in defcore as part of interop testing then it will
probably have continuing value being in tempest as part of the
integration/integrated tests in general. This is one area where some overlap
is expected between testing in projects and tempest, which is not a bad thing.

Discussing the 3rd prong
""""""""""""""""""""""""

There are 2 approaches to addressing the 3rd prong. The first is to raise it
during the tempest discussion at a QA meeting. Please put it on the agenda
well ahead of the scheduled meeting. Since the meeting time is well known
ahead of time, anyone who depends on the tests will have ample time to outline
any concerns before the meeting. To give people ample time to respond to
removal proposals, please add items to the agenda by the Monday before the
meeting.

The other option is to raise the removal on the openstack-dev mailing list
(for example, see: http://lists.openstack.org/pipermail/openstack-dev/2016-February/086218.html ).
This will raise the issue to the wider community and attract at least as much
attention as (and most likely more than) discussing it during the IRC meeting.
The only downside is that it might take more time to get a response, given the
nature of mailing lists.

Exceptions to this procedure
----------------------------

For the most part, all tempest test removals have to go through this procedure.
There are a couple of exceptions though:

#. The class of testing has been decided to be outside the scope of tempest.
#. A revert for a patch which added a broken test, or testing which didn't
   actually run in the gate (basically any revert for something which
   shouldn't have been added).

For the first exception type, the only types of testing in tree which have been
declared out of scope at this point are:

* The CLI tests (which should be completely removed at this point)
* Neutron Adv. Services testing (which should be completely removed at this
  point)
* XML API Tests (which should be completely removed at this point)
* EC2 API/boto tests (which should be completely removed at this point)

For tests that fit into this category, the only criterion for removal is that
there is equivalent testing elsewhere.

Tempest Scope
^^^^^^^^^^^^^

Starting in the Liberty cycle, tempest has defined a set of projects which
are in scope for direct testing in tempest. As of today that list is:

* Keystone
* Nova
* Glance
* Cinder
* Neutron
* Swift

Anything that lives in tempest which doesn't test one of these projects can be
removed, assuming there is equivalent testing elsewhere, preferably using the
`tempest plugin mechanism`_ to maintain continuity after migrating the tests
out of tempest.

.. _tempest plugin mechanism: http://docs.openstack.org/developer/tempest/plugin.html