# Thrift Parameter Validation Proposal

> Version 1.1
>
> Dec 15, 2021
>
> duanyi.aster@bytedance.com, wangtieju@bytedance.com

### 1. Abstract
***
This document proposes a set of annotations for the Thrift IDL. The new annotations support parameter validation using built-in or third-party validators. The goal of this proposal is to define the semantics and behavior of validation annotations, rather than to discuss their implementation.

### 2. Background
***
Parameter validation is a common need for web services. In the past, we usually wrote our validation logic to run after an RPC message had been deserialized by Thrift. This approach is flexible but constrains poorly: service A and service B can use the same IDL yet apply two different validation rules, which often misleads developers. Even if we extract the validation code into a single module, writing simple, repetitive checks (e.g. `if xx.Field1 > 1 then ...`) is tedious. If a build tool can generate the code for simple, fixed constraints, web services become more robust and developers have less work to do.
Other IDLs are gradually gaining strong community support for parameter validation, such as PGV ([protoc-gen-validate](https://github.com/envoyproxy/protoc-gen-validate)), which benefits from pb's strong plugin mechanism (Thrift's lack of an official plugin mechanism is one reason we are submitting this proposal). In the long term, auto-generated parameter validation may be a step towards code-less web services.

### 3. Proposal
***
This proposal has three parts: Validate Annotation Semantics, Validate Rule and Validate Feedback. The first declares how to write a validate annotation, the second explains how every annotation should behave, and the last introduces a feedback mechanism for validation failures.

#### 3.1 Validate Annotation Semantics
The grammar below uses the same notation as the [Thrift IDL](https://thrift.apache.org/docs/idl). Validate annotations only apply to struct fields, so we start from the Field rule.
- Field
```peg
Field <- FieldID? FieldReq? FieldType Identifier ('=' ConstValue)? ValidateAnnotations? ListSeparator?
```
- ValidateAnnotations
```peg
ValidateAnnotations <- '(' ValidateRule+ ListSeparator? ')'
```
- ValidateRule
```peg
ValidateRule <- ('validate' | 'vt') Validator+ '=' '"' ValidatingValue? '"'
```
- Validator

  Built-in validation logic. See the [Supported Validator](#321-supported-validator) section.
```peg
Validator <- '.' Identifier
```
- ValidatingValue
```peg
ValidatingValue <- (ToolFunction '(' )? Arguments ')'?
```
- ToolFunction

  Built-in or user-defined tool functions. See the [Tool Function](#325-tool-function) section.
```peg
ToolFunction <- '@' Identifier
```
- Arguments
```peg
Arguments <- (DynamicValue ListSeparator?)*
```
- DynamicValue
```peg
DynamicValue <- ConstValue | FieldReference
```
- FieldReference

  See the [Field Reference](#324-field-reference) section.
```peg
FieldReference <- '$' ReferPath
ReferPath <- FieldName? ( ('[' IntConstant ']') | ('.' Identifier) )?
```
- All other rules are the same as in the [standard definition](https://thrift.apache.org/docs/idl).

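As a rough illustration of the ValidateRule grammar, the Go sketch below splits an annotation key such as `vt.elem.gt` into its prefix and its chain of validators. The names `Rule` and `ParseRule` are hypothetical and not part of the proposal, and the ValidatingValue is kept as a raw string rather than parsed.

```go
package main

import (
	"fmt"
	"strings"
)

// Rule is a hypothetical in-memory form of one ValidateRule.
type Rule struct {
	Validators []string // e.g. ["elem", "gt"] for the key "vt.elem.gt"
	Value      string   // raw ValidatingValue, left unparsed in this sketch
}

// ParseRule splits an annotation key/value pair following
// ValidateRule <- ('validate' | 'vt') Validator+ '=' '"' ValidatingValue? '"'.
// It returns ok=false when the key is not a validate annotation.
func ParseRule(key, value string) (Rule, bool) {
	parts := strings.Split(key, ".")
	if len(parts) < 2 || (parts[0] != "validate" && parts[0] != "vt") {
		return Rule{}, false
	}
	for _, p := range parts[1:] {
		if p == "" { // rejects keys such as "vt." or "vt..gt"
			return Rule{}, false
		}
	}
	return Rule{Validators: parts[1:], Value: value}, true
}

func main() {
	r, ok := ParseRule("vt.elem.gt", "0")
	fmt.Println(r.Validators, r.Value, ok)
}
```

A real implementation would also need to check each validator name against the supported set and parse the value; both steps are omitted here.
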
### 3.2 Validate Rule
A validate rule works as a Boolean expression, and the Validator is the core logic of a rule. Every Validator works like an operator: it evaluates the Validating Value and the Field Value, and then compares them. For example, `gt` (greater than) compares the value of the field it belongs to with the Validating Value on the right-hand side, and returns `true` if the field value is greater than the Validating Value, or `false` otherwise. We stipulate that the validated parameter is valid only if the validate rule returns `true`. If several validate rules are defined in the annotations of a field, they are combined with logical "and". Simply put, commas in annotations can be treated as "and".

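The "and" semantics can be sketched in Go as follows. This is only an illustration for a `double` field carrying a `ge` and an `le` rule; the helper names `check`, `ge`, `le` and `allOf` are assumptions made for this example, not something the proposal defines.

```go
package main

import "fmt"

// check is one validate rule evaluated against a field value.
type check func(v float64) bool

// ge and le are hypothetical helpers mirroring the `ge` and `le` validators.
func ge(limit float64) check { return func(v float64) bool { return v >= limit } }
func le(limit float64) check { return func(v float64) bool { return v <= limit } }

// allOf combines rules with logical "and": the value is valid only if
// every rule returns true.
func allOf(v float64, rules ...check) bool {
	for _, r := range rules {
		if !r(v) {
			return false
		}
	}
	return true
}

func main() {
	// Two comma-separated rules on one field: ge "1000.1" and le "10000.1".
	fmt.Println(allOf(5000, ge(1000.1), le(10000.1))) // both rules hold
	fmt.Println(allOf(10.5, ge(1000.1), le(10000.1))) // ge fails
}
```
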
#### 3.2.1 Supported Validator
The supported validators are listed below. Value type is the type of the validating value; field type is the type of the validated field.

| validator    | behavior                              | value type                              | field type             | secondary validator |
| ------------ | ------------------------------------- | --------------------------------------- | ---------------------- | ------------------- |
| const        | must be constant                      | string, bool                            | same as value          | -                   |
| defined_only | must be defined value                 | enum                                    | enum                   | -                   |
| not_nil      | must not be empty                     | "true"                                  | any                    | -                   |
| skip         | skip validation                       | "true"                                  | any                    | -                   |
| eq           | equal to (`==`)                       | i8, i16, i32, i64, double, string, bool | same as value          | -                   |
| ne           | not equal to (`!=`)                   | i8, i16, i32, i64, double, string, bool | same as value          | -                   |
| lt           | less than (`<`)                       | i8, i16, i32, i64, double               | same as value          | -                   |
| le           | less than or equal to (`<=`)          | i8, i16, i32, i64, double               | same as value          | -                   |
| gt           | greater than (`>`)                    | i8, i16, i32, i64, double               | same as value          | -                   |
| ge           | greater than or equal to (`>=`)       | i8, i16, i32, i64, double               | same as value          | -                   |
| in           | within given container                | i8, i16, i32, i64, double, enum         | same as value          | -                   |
| not_in       | not within given container            | i8, i16, i32, i64, double, enum         | same as value          | -                   |
| elem         | field's element constraint            | any                                     | list, set              | supported           |
| key          | field's element key constraint        | any                                     | map                    | supported           |
| value        | field's element value constraint      | any                                     | map                    | supported           |
| min_size     | minimal length                        | i8, i16, i32, i64                       | string, list, set, map | -                   |
| max_size     | maximal length                        | i8, i16, i32, i64                       | string, list, set, map | -                   |
| prefix       | prefix must be (case-sensitive)       | string                                  | string                 | -                   |
| suffix       | suffix must be (case-sensitive)       | string                                  | string                 | -                   |
| contains     | must contain (case-sensitive)         | string                                  | string                 | -                   |
| not_contains | must not contain (case-sensitive)     | string                                  | string                 | -                   |
| pattern      | basic regular expression              | string                                  | string                 | -                   |

- `pattern` uses Basic Regular Expressions (BRE); the BRE syntax is described in the GNU sed [manual](https://www.gnu.org/software/sed/manual/html_node/BRE-syntax.html).
- A secondary validator (`elem`, `key` or `value`) is followed by another validator and is usually used on container-type fields. See the Set/List and Map examples below.
- Append the suffix `_escape` to a validator to prevent its value from being interpreted as a tool function. For example, use `vt.eq_escape = "@len(A)"` to match the literal string `@len(A)`.

#### 3.2.2 IDL example
- Number
```
struct NumericDemo {
    1: double Value (vt.ge = "1000.1", vt.le = "10000.1")
    2: i8 Type (vt.in = "[1, 2, 4]")
}
```
- String/Binary
```
struct StringDemo {
    1: string Uninitialized (vt.const = "abc")
    2: string Name (vt.min_size = "6", vt.max_size = "12")
    3: string SomeStuffs (vt.pattern = "[0-9A-Za-z]+")
    4: string DebugInfo (vt.prefix = "[Debug]")
    5: string ErrorMessage (vt.contains = "Error")
}
```
- Bool
```
struct BoolDemo {
    1: bool AMD (vt.const = "true")
}
```
- Enum
```
enum Type {
    Bool
    I8
    I16
    I32
    I64
    String
    Struct
    List
    Set
    Map
}

struct EnumDemo {
    1: Type AddressType (vt.in = "[String]")
    2: Type ValueType (vt.defined_only = "true")
}
```
- Set/List
```
struct SetListDemo {
    1: list<string> Persons (vt.min_size = "5", vt.max_size = "10")
    2: set<double> HealthPoints (vt.elem.gt = "0")
}
```
- Map
```
struct MapDemo {
    1: map<i32, string> IdName (vt.min_size = "5", vt.max_size = "10")
    2: map<i32, double> Some (vt.key.gt = "0", vt.value.lt = "1000")
}
```

#### 3.2.3 Arguments
Arguments can be static literals or dynamic variables. If a literal expression contains any Field Reference or Tool Function, it becomes a dynamic variable. Every dynamic variable is eventually evaluated to a Thrift constant value.

#### 3.2.4 Field Reference
A Field Reference refers to another field's value inside a Validating Value, so that a rule can compare more than one field. The referenced field must be in the same struct, and the identifier must be the name of the referenced field.
- Field Reference Rule
1. `$x` represents a variable named x, scoped to the current struct
2. `$` refers to the current field, on which the validator is declared
3. `$x['k']` refers to the value under key k of variable x (which must be a map)
4. `$x[i]` refers to the element at zero-based index i, i.e. the (i + 1)-th element, of variable x (which must be a list)
- Example
```
struct FieldReferenceExample {
    1: string A (vt.eq = "$B") // field A must equal field B
    2: list<string> C
    3: string B
}
```

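A minimal Go sketch of how `vt.eq = "$B"` might be resolved is shown below. It assumes the struct also has a string field `B`, uses a hand-written struct in place of generated code, and returns a plain Go `error`; none of these details are fixed by the proposal.

```go
package main

import "fmt"

// FieldReferenceExample is a hand-written stand-in for the generated struct.
// Field B is assumed to exist as a string field in the same struct.
type FieldReferenceExample struct {
	A string // (vt.eq = "$B")
	B string
	C []string
}

// Validate resolves the Field Reference `$B` against the same struct
// instance at validation time, then applies the `eq` validator to A.
func (p *FieldReferenceExample) Validate() error {
	if p.A != p.B {
		return fmt.Errorf("A: eq failed: %q is not equal to $B = %q", p.A, p.B)
	}
	return nil
}

func main() {
	fmt.Println((&FieldReferenceExample{A: "x", B: "x"}).Validate())
	fmt.Println((&FieldReferenceExample{A: "x", B: "y"}).Validate())
}
```

Because the reference is read from the struct being validated, the rule stays dynamic: the same annotation yields a different comparison value for each message.
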
#### 3.2.5 Tool Function
Tool Functions extend what a Validating Value can express. For example, to ensure that a field is larger than the length of string field A, we can use the `len()` function: `vt.gt = "@len($A)"`. The arguments can be either literals or variables, with no size limit. However, we do not prescribe a set of built-in functions here, because the range of possible functions is too wide and always language-dependent. Instead, we only propose a mechanism that lets Thrift developers extend their implementation for the language they use.

Supported functions:

| function | behavior                | arguments                           | results  | supported language |
| -------- | ----------------------- | ----------------------------------- | -------- | ------------------ |
| len      | the length of the field | 1: string, binary, list, set or map | 1: int64 | go                 |

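One possible shape for such an extension mechanism, sketched in Go, is a registry that maps tool-function names to implementations. The registry `toolFuncs`, the struct `Demo` and its field `N` are hypothetical, only a few argument types are handled, and string length is taken in bytes, which is an assumption the proposal does not settle.

```go
package main

import "fmt"

// toolFuncs is a hypothetical registry of tool functions, keyed by the
// name that follows '@' in a Validating Value.
var toolFuncs = map[string]func(arg interface{}) (int64, error){
	"len": func(arg interface{}) (int64, error) {
		switch v := arg.(type) {
		case string:
			return int64(len(v)), nil // byte length, by assumption
		case []byte:
			return int64(len(v)), nil
		case []string:
			return int64(len(v)), nil
		default:
			return 0, fmt.Errorf("len: unsupported argument type %T", arg)
		}
	},
}

// Demo is a hand-written stand-in for a struct declared as:
//
//	1: string A
//	2: i64 N (vt.gt = "@len($A)")
type Demo struct {
	A string
	N int64
}

// Validate evaluates @len($A) first, then applies `gt` to N.
func (d *Demo) Validate() error {
	limit, err := toolFuncs["len"](d.A)
	if err != nil {
		return err
	}
	if !(d.N > limit) {
		return fmt.Errorf("N: gt failed: %d is not > @len($A) = %d", d.N, limit)
	}
	return nil
}

func main() {
	fmt.Println((&Demo{A: "abc", N: 4}).Validate())
	fmt.Println((&Demo{A: "abc", N: 3}).Validate())
}
```
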
### 3.3 Feedback
The generated validation code should be placed in the struct's `Validate() TApplicationException` method. If all validate rules declared by a struct pass, the method returns nil (or simply returns without an exception, depending on the language implementation); otherwise it returns a `TApplicationException` with a feedback message indicating the reason for the failure. Because language features differ, we do not constrain the interface of feedback messages. However, in practice we suggest that developers provide the following three details:

- The position where the first validation failure happens.
- The validator that reports the failure.
- The offending field value and the validating value at the time of the failure.
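
The three suggested details could be carried by an error value along the lines of the Go sketch below. `ValidationError` is a hypothetical type, and the method returns Go's built-in `error` rather than a `TApplicationException` to keep the example self-contained; a real generator would decide how to map this onto the Thrift exception type.

```go
package main

import "fmt"

// ValidationError is a hypothetical feedback type; the proposal does not
// prescribe its shape. It carries the three suggested details.
type ValidationError struct {
	Position   string // where the first failure happened
	Validator  string // validator that reported the failure
	FieldValue string // offending field value
	RuleValue  string // validating value from the annotation
}

func (e *ValidationError) Error() string {
	return fmt.Sprintf("%s: %s failed: field value %s, validating value %s",
		e.Position, e.Validator, e.FieldValue, e.RuleValue)
}

// NumericDemo is a hand-written stand-in for the generated struct.
type NumericDemo struct {
	Value float64 // (vt.ge = "1000.1", vt.le = "10000.1")
	Type  int8    // (vt.in = "[1, 2, 4]")
}

// Validate stops at the first failing rule and returns nil when all pass.
func (p *NumericDemo) Validate() error {
	fail := func(field, validator string, got interface{}, want string) error {
		return &ValidationError{
			Position:   "NumericDemo." + field,
			Validator:  validator,
			FieldValue: fmt.Sprint(got),
			RuleValue:  want,
		}
	}
	if !(p.Value >= 1000.1) {
		return fail("Value", "ge", p.Value, "1000.1")
	}
	if !(p.Value <= 10000.1) {
		return fail("Value", "le", p.Value, "10000.1")
	}
	if p.Type != 1 && p.Type != 2 && p.Type != 4 {
		return fail("Type", "in", p.Type, "[1, 2, 4]")
	}
	return nil
}

func main() {
	fmt.Println((&NumericDemo{Value: 5000, Type: 2}).Validate())
	fmt.Println((&NumericDemo{Value: 5000, Type: 3}).Validate())
}
```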