As discussed in Chapter~\ref{sec:funcheck-theory}, a linter that detects reassignments
within a Go program needs to be implemented to assist with writing purely
functional code.

To get a grasp of the issues this linter should report, the first step
is to collect some examples that the linter can be matched against.

\subsection{Examples}

The simplest cases are standalone reassignments and assignment operators:
\begin{gocode}
x := 5
x = 6 // forbidden
// or
var y = 5
y = 6 // forbidden
y += 6 // forbidden
y <<= 2 // forbidden
y++ // forbidden
\end{gocode}

The statements marked with a \mintinline{go}|// forbidden| comment should be reported.

Once block scoping is taken into account, shadowing an outer variable needs to be allowed:
\begin{gocode}
x := 5
{
    x = 6 // forbidden, changing the old value
    x := 6 // allowed, as this shadows the old variable
}
\end{gocode}

Declaring a variable first and assigning a value to it afterwards should
also be reported:
\begin{gocode}
var x int
x = 6 // forbidden
\end{gocode}

Functions are the exception, as they need to be declared first in order
to call them recursively:
\begin{gocode}
var f func()
f = func() {
    f()
}
\end{gocode}

Furthermore, the linter also needs to be able to handle multiple variables
at once:
\begin{gocode}
var f func()
x, f, y := 1, func() { f() }, 2
\end{gocode}

Loops should be reported too, as they use reassignments internally:
\begin{gocode}
for i := 0; i < 5; i++ { // forbidden
    for i != 3 { // forbidden
        for { // allowed
            // ...
        }
    }
}
\end{gocode}

All the aforementioned examples and more can be found in the test cases for funcheck\autocite{funcheck-examples}.

\subsection{Building a linter}

The Go ecosystem already provides an official library for building code analysis tools,
the `analysis' package from the Go Tools repository\autocite{go-analysis}. With this package,
implementing a static code analyser is reduced to writing the actual AST node analysis.

To define an analysis, a variable of type \mintinline{go}|*analysis.Analyzer| has to be declared:

\begin{gocode}
var Analyzer = &analysis.Analyzer{
    Name: "assigncheck",
    Doc:  "reports reassignments",
    Run:  run, // run has the signature func(*analysis.Pass) (interface{}, error)
}
\end{gocode}

The remaining steps are to implement the `Run' function and to register the analyser
in the \mintinline{go}|main()| function.

The `Run' function takes an argument of type \mintinline{go}|*analysis.Pass|. The pass provides
information about the package that is being analysed and some helper functions to report
diagnostics.

With \mintinline{go}|analysis.Pass.Files| and the help of the `go/ast' package, traversing the syntax
tree of every file in a package takes only a few lines:

\begin{gocode}
for _, file := range pass.Files {
    ast.Inspect(file, func(n ast.Node) bool {
        // node analysis here
        return true // continue with the children of n
    })
}
\end{gocode}

To implement funcheck as described, five different AST node types need to be
taken care of. The simpler ones are
\mintinline{go}|*ast.IncDecStmt|, \mintinline{go}|*ast.ForStmt| and \mintinline{go}|*ast.RangeStmt|.
An `IncDecStmt' node is a \mintinline{go}|x++| or \mintinline{go}|x--|
statement and should always be reported.
`ForStmt' and `RangeStmt' are similar; a `RangeStmt' is a `for' loop with the
\mintinline{go}|range| keyword instead of an init, condition and post statement.

Both of these loop types need to be reported explicitly, as they do not show up
as reassignments in the AST.
Thus, the basic building block for the AST traversal is the following \mintinline{go}|switch|
statement:
\begin{code}
\gofilerange{../work/funcheck/assigncheck/assigncheck.go}{start-basictypes}{end-basictypes}%
\caption{Handling the basic AST types in funcheck\autocite{funcheck-ast-types}}
\end{code}
The remaining two node types are \mintinline{go}|*ast.DeclStmt| and \mintinline{go}|*ast.AssignStmt|.
They are not as simple to handle, which is why they are covered in their own sections.

\subsection{Detecting reassignments}

To recapitulate, the goal of this step is to detect all assignments, with two exceptions:
assignments to blank identifiers (discarded values cannot be mutated) and assignments of function
literals, provided the function variable is declared in the immediately preceding
statement\footnote{This rule simplifies the logic of the checker and makes the code easier
for developers to read. It means that no code may be placed between \mintinline{go}|var f func|
and \mintinline{go}|f = func() { ... }|.}.

To detect such reassignments, funcheck iterates over all expressions on the left-hand side
of an assignment statement.

The left-hand side of an assignment is a list of expressions. These expressions can be
identifiers, index expressions (\mintinline{go}|*ast.IndexExpr|, for map and slice access),
`star expressions' (\mintinline{go}|*ast.StarExpr|\footnote{Star expressions
are expressions that are prefixed by an asterisk, dereferencing a pointer. For example
\mintinline{go}|*x = 5|, if \mintinline{go}|x| is of type \mintinline{go}|*int|.}) or others.

If the expression is not an identifier, the assignment must be a reassignment, as all non-identifier
expressions contain an already declared identifier. For example, the slice index expression
\mintinline{go}|s[5]| is of type \mintinline{go}|*ast.IndexExpr|:
\begin{gocode}
// An IndexExpr node represents an expression followed by an index.
IndexExpr struct {
    X      Expr      // expression
    Lbrack token.Pos // position of "["
    Index  Expr      // index expression
    Rbrack token.Pos // position of "]"
}
\end{gocode}

Here, \mintinline{go}|IndexExpr.X| is the identifier `s' (of type \mintinline{go}|*ast.Ident|)
and \mintinline{go}|IndexExpr.Index| is \mintinline{go}|5| (of type \mintinline{go}|*ast.BasicLit|).

As these nested identifiers need to be declared beforehand (otherwise they could not be used
in the expression), all expressions on the left-hand side of an assignment that are not identifiers
are reassignments.

Identifiers are the only expressions that can occur in both declarations and reassignments. A naive
approach would be to check for the colon of a short variable declaration (\mintinline{go}|:=|).
However, as touched upon in Chapter~\ref{sec:multi-assign}, even short variable declarations may
reassign existing variables, as long as at least one variable on the left-hand side is new.

Thus, another approach is needed.

Every identifier (an AST node of type \mintinline{go}|*ast.Ident|) contains an object\footnote{`An
object describes a named language entity such as a package, constant, type, variable,
function (incl. methods), or a label'\autocite{go-ast-object}.} that links to the declaration.

This declaration, of whatever type it may be, always has a position (and a corresponding function
to retrieve that position) in the source file.

A reassignment is detected if an identifier's declaration position does not match the assignment's
position (indicating that the variable is being assigned at a different place to where it is
declared).

This is illustrated in code block~\ref{code:assign-pos}. In
the assignment \mintinline{go}|y = 3|, \mintinline{go}|y|'s declaration refers to the position
of the first assignment \mintinline{go}|x, y := 1, 2|, the position where \mintinline{go}|y| was
declared.
\begin{listing}
\begin{gocode}
Assignment "x, y := 1, 2": 2958101
    Ident "x": 2958101
        Decl "x, y := 1, 2": 2958101
    Ident "y": 2958104
        Decl "x, y := 1, 2": 2958101
Assignment "y = 3": 2958115
    Ident "y": 2958115
        Decl "x, y := 1, 2": 2958101
\end{gocode}
\caption{Illustration of an assignment node and corresponding positions\autocite{ast-positions}\label{code:assign-pos}}
\end{listing}

As this technique works on the level of individual identifiers, multi-variable declarations and
assignments can be verified without any additional effort.

If a variable in a short variable declaration is being reassigned, the `Decl' field of its
object will point to the original position of its declaration, which is easy to detect
(as shown in code block~\ref{code:assign-pos}).

\subsection{Handling function declarations}

In contrast to all other variable types, function variables may be `reassigned' once.
As discussed in Chapter~\ref{sec:func-reassign}, this is to allow recursive function
literals. Detecting and not reporting these assignments is a two-step process, as two
consecutive AST nodes need to be inspected.

The first step is to detect function declarations, that is, statements of the form
\mintinline{go}|var f func()|. Should such a statement be encountered,
its position is saved for the following AST node.

For the next AST node it is ensured that, if the node is an assignment and
the assigned value is a function literal, the declaration position of the assignee matches the
previously saved one.

The position of the declaration and the AST node structure can be seen in code block~\ref{code:func-reassign}.

\begin{listing}
\begin{gocode}
Declaration "var f func() int": 2958142
    Ident "f func() int": 2958146
Assignment "f = func() int { return y }": 2958160
    Ident "f": 2958160
        Decl "f func() int": 2958146
\end{gocode}
\caption{Illustration of a function literal assignment\autocite{ast-positions}\label{code:func-reassign}}
\end{listing}

With this technique it is possible to exempt functions from the reassignment rule.

\subsection{Testing Funcheck}

The analysis package is distributed with a sub-package `analysistest'. This package makes
it straightforward to test a code analysis tool.

By providing test data and the expected messages from funcheck in a structured way, all
that is needed to test funcheck is:

\begin{code}
\begin{gocode}
package assigncheck

import (
    "testing"

    "golang.org/x/tools/go/analysis/analysistest"
)

func TestRun(t *testing.T) {
    analysistest.Run(t, analysistest.TestData(), Analyzer)
}
\end{gocode}
\caption{Testing a code analyser with the `analysistest' package}
\end{code}

The library expects the test data in a folder named `testdata' in the current directory and
runs the analyser on the files in that folder. Comments in those files
describe the expected messages:

\begin{gocode}
x := 5
fmt.Println(x)
x = 6 // want `^reassignment of x$`
fmt.Println(x)
\end{gocode}

This ensures that on the line \mintinline{go}|x = 6| an error message is reported that says
`reassignment of x'.