diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS new file mode 100644 index 0000000..c7d0efe --- /dev/null +++ b/.github/CODEOWNERS @@ -0,0 +1 @@ +* @AlekSi diff --git a/.github/ISSUE_TEMPLATE.md b/.github/ISSUE_TEMPLATE.md new file mode 100644 index 0000000..f6aa9f6 --- /dev/null +++ b/.github/ISSUE_TEMPLATE.md @@ -0,0 +1,16 @@ +*Please use this template for reporting suspected bugs or requests for help.* + +# Issue description + +# Environment + +* Gosh version (commit hash if unreleased): +* `go env` output: + +# Minimal test code / Steps to reproduce the issue + +1. + +# What's the actual result? (include panic message & call stack if applicable) + +# What's the expected result? diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md new file mode 100644 index 0000000..6348e3a --- /dev/null +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -0,0 +1,34 @@ +# Pull Request Notice + +Before sending a pull request make sure each commit solves one clear, minimal, +plausible problem. Further each commit should have the following format: + +``` +Problem: X is broken + +Solution: do Y and Z to fix X +``` + +Please try to have the code changes conform to the style of the surrounding code. + +Please avoid sending a pull request with recursive merge nodes, as they +are impossible to fix once merged. Please rebase your branch on +top of `master` instead of merging it. + +``` +git remote add upstream git@github.com:gosh-lang/gosh.git +git fetch upstream +git rebase upstream/master +git push -f +``` + +In case you already merged instead of rebasing you can drop the merge commit. + +``` +git rebase -i HEAD~10 +``` + +Now, find your merge commit and mark it as drop and save. Finally rebase! 
+ +If you are a new contributor please have a look at our contributing guidelines: +[CONTRIBUTING.md](../CONTRIBUTING.md) diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..59395ce --- /dev/null +++ b/.gitignore @@ -0,0 +1,9 @@ +/.idea/ +/.vscode/ + +/vendor/ + +/internal/gofuzz/workdir/ + +*.out +*.test diff --git a/.golangci.yml b/.golangci.yml new file mode 100644 index 0000000..7db6339 --- /dev/null +++ b/.golangci.yml @@ -0,0 +1,37 @@ +--- +linters-settings: + govet: + use-installed-packages: true + + gocyclo: + min-complexity: 30 + + maligned: + suggest-new: true + + # prevent import of go/ast, go/parser, etc. + depguard: + list-type: blacklist + include-go-root: true + packages: + - go/ + + lll: + line-length: 140 + tab-width: 4 + + unused: + check-exported: true + + unparam: + algo: rta + check-exported: true + +linters: + enable-all: true + +issues: + exclude-use-default: false + exclude: + # gas: Duplicates errcheck, but with more false positives like strings.Builder.WriteString + - "G104: Errors unhandled" diff --git a/CHANGELOG.md b/CHANGELOG.md new file mode 100644 index 0000000..a5bf980 --- /dev/null +++ b/CHANGELOG.md @@ -0,0 +1,5 @@ +## v0.0.1 "Smile" (2016-09-28) + +* First public version. +* You can write FizzBuzz in it! +* But not much else. diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md new file mode 100644 index 0000000..b0865f3 --- /dev/null +++ b/CODE_OF_CONDUCT.md @@ -0,0 +1,73 @@ +# Contributor Covenant Code of Conduct + +## Our Pledge + +In the interest of fostering an open and welcoming environment, we as +contributors and maintainers pledge to making participation in our project and +our community a harassment-free experience for everyone, regardless of age, body +size, disability, ethnicity, sex characteristics, gender identity and expression, +level of experience, education, socio-economic status, nationality, personal +appearance, race, religion, or sexual identity and orientation. 
+ +## Our Standards + +Examples of behavior that contributes to creating a positive environment +include: + +* Using welcoming and inclusive language +* Being respectful of differing viewpoints and experiences +* Gracefully accepting constructive criticism +* Focusing on what is best for the community +* Showing empathy towards other community members + +Examples of unacceptable behavior by participants include: + +* The use of sexualized language or imagery and unwelcome sexual attention or + advances +* Trolling, insulting/derogatory comments, and personal or political attacks +* Public or private harassment +* Publishing others' private information, such as a physical or electronic + address, without explicit permission +* Other conduct which could reasonably be considered inappropriate in a + professional setting + +## Our Responsibilities + +Project maintainers are responsible for clarifying the standards of acceptable +behavior and are expected to take appropriate and fair corrective action in +response to any instances of unacceptable behavior. + +Project maintainers have the right and responsibility to remove, edit, or +reject comments, commits, code, wiki edits, issues, and other contributions +that are not aligned to this Code of Conduct, or to ban temporarily or +permanently any contributor for other behaviors that they deem inappropriate, +threatening, offensive, or harmful. + +## Scope + +This Code of Conduct applies both within project spaces and in public spaces +when an individual is representing the project or its community. Examples of +representing a project or community include using an official project e-mail +address, posting via an official social media account, or acting as an appointed +representative at an online or offline event. Representation of a project may be +further defined and clarified by project maintainers. 
+ +## Enforcement + +Instances of abusive, harassing, or otherwise unacceptable behavior may be +reported by contacting the project team at conduct@gosh-lang.org. All +complaints will be reviewed and investigated and will result in a response that +is deemed necessary and appropriate to the circumstances. The project team is +obligated to maintain confidentiality with regard to the reporter of an incident. +Further details of specific enforcement policies may be posted separately. + +Project maintainers who do not follow or enforce the Code of Conduct in good +faith may face temporary or permanent repercussions as determined by other +members of the project's leadership. + +## Attribution + +This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, +available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html + +[homepage]: https://www.contributor-covenant.org diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 0000000..1020ddd --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,11 @@ +# Contributing Guidelines + +Gosh, as an experiment, uses the [C4 process](https://rfc.zeromq.org/spec:42/C4/). +Please do take the time to read it. + +If you feel like it is too heavyweight and you rather not contribute at all than use it, +please at least follow those 2 basic rules: + +1. Always create an issue first. Always communicate before writing code. +2. Please use the suggested commit message format and pull request body format. + That allows us to use them in the CHANGELOG.md without changes. diff --git a/LICENSE b/LICENSE new file mode 100644 index 0000000..a612ad9 --- /dev/null +++ b/LICENSE @@ -0,0 +1,373 @@ +Mozilla Public License Version 2.0 +================================== + +1. Definitions +-------------- + +1.1. "Contributor" + means each individual or legal entity that creates, contributes to + the creation of, or owns Covered Software. + +1.2. 
"Contributor Version" + means the combination of the Contributions of others (if any) used + by a Contributor and that particular Contributor's Contribution. + +1.3. "Contribution" + means Covered Software of a particular Contributor. + +1.4. "Covered Software" + means Source Code Form to which the initial Contributor has attached + the notice in Exhibit A, the Executable Form of such Source Code + Form, and Modifications of such Source Code Form, in each case + including portions thereof. + +1.5. "Incompatible With Secondary Licenses" + means + + (a) that the initial Contributor has attached the notice described + in Exhibit B to the Covered Software; or + + (b) that the Covered Software was made available under the terms of + version 1.1 or earlier of the License, but not also under the + terms of a Secondary License. + +1.6. "Executable Form" + means any form of the work other than Source Code Form. + +1.7. "Larger Work" + means a work that combines Covered Software with other material, in + a separate file or files, that is not Covered Software. + +1.8. "License" + means this document. + +1.9. "Licensable" + means having the right to grant, to the maximum extent possible, + whether at the time of the initial grant or subsequently, any and + all of the rights conveyed by this License. + +1.10. "Modifications" + means any of the following: + + (a) any file in Source Code Form that results from an addition to, + deletion from, or modification of the contents of Covered + Software; or + + (b) any new file in Source Code Form that contains any Covered + Software. + +1.11. "Patent Claims" of a Contributor + means any patent claim(s), including without limitation, method, + process, and apparatus claims, in any patent Licensable by such + Contributor that would be infringed, but for the grant of the + License, by the making, using, selling, offering for sale, having + made, import, or transfer of either its Contributions or its + Contributor Version. + +1.12. 
"Secondary License" + means either the GNU General Public License, Version 2.0, the GNU + Lesser General Public License, Version 2.1, the GNU Affero General + Public License, Version 3.0, or any later versions of those + licenses. + +1.13. "Source Code Form" + means the form of the work preferred for making modifications. + +1.14. "You" (or "Your") + means an individual or a legal entity exercising rights under this + License. For legal entities, "You" includes any entity that + controls, is controlled by, or is under common control with You. For + purposes of this definition, "control" means (a) the power, direct + or indirect, to cause the direction or management of such entity, + whether by contract or otherwise, or (b) ownership of more than + fifty percent (50%) of the outstanding shares or beneficial + ownership of such entity. + +2. License Grants and Conditions +-------------------------------- + +2.1. Grants + +Each Contributor hereby grants You a world-wide, royalty-free, +non-exclusive license: + +(a) under intellectual property rights (other than patent or trademark) + Licensable by such Contributor to use, reproduce, make available, + modify, display, perform, distribute, and otherwise exploit its + Contributions, either on an unmodified basis, with Modifications, or + as part of a Larger Work; and + +(b) under Patent Claims of such Contributor to make, use, sell, offer + for sale, have made, import, and otherwise transfer either its + Contributions or its Contributor Version. + +2.2. Effective Date + +The licenses granted in Section 2.1 with respect to any Contribution +become effective for each Contribution on the date the Contributor first +distributes such Contribution. + +2.3. Limitations on Grant Scope + +The licenses granted in this Section 2 are the only rights granted under +this License. No additional rights or licenses will be implied from the +distribution or licensing of Covered Software under this License. 
+Notwithstanding Section 2.1(b) above, no patent license is granted by a +Contributor: + +(a) for any code that a Contributor has removed from Covered Software; + or + +(b) for infringements caused by: (i) Your and any other third party's + modifications of Covered Software, or (ii) the combination of its + Contributions with other software (except as part of its Contributor + Version); or + +(c) under Patent Claims infringed by Covered Software in the absence of + its Contributions. + +This License does not grant any rights in the trademarks, service marks, +or logos of any Contributor (except as may be necessary to comply with +the notice requirements in Section 3.4). + +2.4. Subsequent Licenses + +No Contributor makes additional grants as a result of Your choice to +distribute the Covered Software under a subsequent version of this +License (see Section 10.2) or under the terms of a Secondary License (if +permitted under the terms of Section 3.3). + +2.5. Representation + +Each Contributor represents that the Contributor believes its +Contributions are its original creation(s) or it has sufficient rights +to grant the rights to its Contributions conveyed by this License. + +2.6. Fair Use + +This License is not intended to limit any rights You have under +applicable copyright doctrines of fair use, fair dealing, or other +equivalents. + +2.7. Conditions + +Sections 3.1, 3.2, 3.3, and 3.4 are conditions of the licenses granted +in Section 2.1. + +3. Responsibilities +------------------- + +3.1. Distribution of Source Form + +All distribution of Covered Software in Source Code Form, including any +Modifications that You create or to which You contribute, must be under +the terms of this License. You must inform recipients that the Source +Code Form of the Covered Software is governed by the terms of this +License, and how they can obtain a copy of this License. You may not +attempt to alter or restrict the recipients' rights in the Source Code +Form. + +3.2. 
Distribution of Executable Form + +If You distribute Covered Software in Executable Form then: + +(a) such Covered Software must also be made available in Source Code + Form, as described in Section 3.1, and You must inform recipients of + the Executable Form how they can obtain a copy of such Source Code + Form by reasonable means in a timely manner, at a charge no more + than the cost of distribution to the recipient; and + +(b) You may distribute such Executable Form under the terms of this + License, or sublicense it under different terms, provided that the + license for the Executable Form does not attempt to limit or alter + the recipients' rights in the Source Code Form under this License. + +3.3. Distribution of a Larger Work + +You may create and distribute a Larger Work under terms of Your choice, +provided that You also comply with the requirements of this License for +the Covered Software. If the Larger Work is a combination of Covered +Software with a work governed by one or more Secondary Licenses, and the +Covered Software is not Incompatible With Secondary Licenses, this +License permits You to additionally distribute such Covered Software +under the terms of such Secondary License(s), so that the recipient of +the Larger Work may, at their option, further distribute the Covered +Software under the terms of either this License or such Secondary +License(s). + +3.4. Notices + +You may not remove or alter the substance of any license notices +(including copyright notices, patent notices, disclaimers of warranty, +or limitations of liability) contained within the Source Code Form of +the Covered Software, except that You may alter any license notices to +the extent required to remedy known factual inaccuracies. + +3.5. Application of Additional Terms + +You may choose to offer, and to charge a fee for, warranty, support, +indemnity or liability obligations to one or more recipients of Covered +Software. 
However, You may do so only on Your own behalf, and not on +behalf of any Contributor. You must make it absolutely clear that any +such warranty, support, indemnity, or liability obligation is offered by +You alone, and You hereby agree to indemnify every Contributor for any +liability incurred by such Contributor as a result of warranty, support, +indemnity or liability terms You offer. You may include additional +disclaimers of warranty and limitations of liability specific to any +jurisdiction. + +4. Inability to Comply Due to Statute or Regulation +--------------------------------------------------- + +If it is impossible for You to comply with any of the terms of this +License with respect to some or all of the Covered Software due to +statute, judicial order, or regulation then You must: (a) comply with +the terms of this License to the maximum extent possible; and (b) +describe the limitations and the code they affect. Such description must +be placed in a text file included with all distributions of the Covered +Software under this License. Except to the extent prohibited by statute +or regulation, such description must be sufficiently detailed for a +recipient of ordinary skill to be able to understand it. + +5. Termination +-------------- + +5.1. The rights granted under this License will terminate automatically +if You fail to comply with any of its terms. However, if You become +compliant, then the rights granted under this License from a particular +Contributor are reinstated (a) provisionally, unless and until such +Contributor explicitly and finally terminates Your grants, and (b) on an +ongoing basis, if such Contributor fails to notify You of the +non-compliance by some reasonable means prior to 60 days after You have +come back into compliance. 
Moreover, Your grants from a particular +Contributor are reinstated on an ongoing basis if such Contributor +notifies You of the non-compliance by some reasonable means, this is the +first time You have received notice of non-compliance with this License +from such Contributor, and You become compliant prior to 30 days after +Your receipt of the notice. + +5.2. If You initiate litigation against any entity by asserting a patent +infringement claim (excluding declaratory judgment actions, +counter-claims, and cross-claims) alleging that a Contributor Version +directly or indirectly infringes any patent, then the rights granted to +You by any and all Contributors for the Covered Software under Section +2.1 of this License shall terminate. + +5.3. In the event of termination under Sections 5.1 or 5.2 above, all +end user license agreements (excluding distributors and resellers) which +have been validly granted by You or Your distributors under this License +prior to termination shall survive termination. + +************************************************************************ +* * +* 6. Disclaimer of Warranty * +* ------------------------- * +* * +* Covered Software is provided under this License on an "as is" * +* basis, without warranty of any kind, either expressed, implied, or * +* statutory, including, without limitation, warranties that the * +* Covered Software is free of defects, merchantable, fit for a * +* particular purpose or non-infringing. The entire risk as to the * +* quality and performance of the Covered Software is with You. * +* Should any Covered Software prove defective in any respect, You * +* (not any Contributor) assume the cost of any necessary servicing, * +* repair, or correction. This disclaimer of warranty constitutes an * +* essential part of this License. No use of any Covered Software is * +* authorized under this License except under this disclaimer. 
* +* * +************************************************************************ + +************************************************************************ +* * +* 7. Limitation of Liability * +* -------------------------- * +* * +* Under no circumstances and under no legal theory, whether tort * +* (including negligence), contract, or otherwise, shall any * +* Contributor, or anyone who distributes Covered Software as * +* permitted above, be liable to You for any direct, indirect, * +* special, incidental, or consequential damages of any character * +* including, without limitation, damages for lost profits, loss of * +* goodwill, work stoppage, computer failure or malfunction, or any * +* and all other commercial damages or losses, even if such party * +* shall have been informed of the possibility of such damages. This * +* limitation of liability shall not apply to liability for death or * +* personal injury resulting from such party's negligence to the * +* extent applicable law prohibits such limitation. Some * +* jurisdictions do not allow the exclusion or limitation of * +* incidental or consequential damages, so this exclusion and * +* limitation may not apply to You. * +* * +************************************************************************ + +8. Litigation +------------- + +Any litigation relating to this License may be brought only in the +courts of a jurisdiction where the defendant maintains its principal +place of business and such litigation shall be governed by laws of that +jurisdiction, without reference to its conflict-of-law provisions. +Nothing in this Section shall prevent a party's ability to bring +cross-claims or counter-claims. + +9. Miscellaneous +---------------- + +This License represents the complete agreement concerning the subject +matter hereof. If any provision of this License is held to be +unenforceable, such provision shall be reformed only to the extent +necessary to make it enforceable. 
Any law or regulation which provides +that the language of a contract shall be construed against the drafter +shall not be used to construe this License against a Contributor. + +10. Versions of the License +--------------------------- + +10.1. New Versions + +Mozilla Foundation is the license steward. Except as provided in Section +10.3, no one other than the license steward has the right to modify or +publish new versions of this License. Each version will be given a +distinguishing version number. + +10.2. Effect of New Versions + +You may distribute the Covered Software under the terms of the version +of the License under which You originally received the Covered Software, +or under the terms of any subsequent version published by the license +steward. + +10.3. Modified Versions + +If you create software not governed by this License, and you want to +create a new license for such software, you may create and use a +modified version of this License if you rename the license and remove +any references to the name of the license steward (except to note that +such modified license differs from this License). + +10.4. Distributing Source Code Form that is Incompatible With Secondary +Licenses + +If You choose to distribute Source Code Form that is Incompatible With +Secondary Licenses under the terms of this version of the License, the +notice described in Exhibit B of this License must be attached. + +Exhibit A - Source Code Form License Notice +------------------------------------------- + + This Source Code Form is subject to the terms of the Mozilla Public + License, v. 2.0. If a copy of the MPL was not distributed with this + file, You can obtain one at http://mozilla.org/MPL/2.0/. + +If it is not possible or desirable to put the notice in a particular +file, then You may include the notice in a location (such as a LICENSE +file in a relevant directory) where a recipient would be likely to look +for such a notice. 
+ +You may add additional accurate notices of copyright ownership. + +Exhibit B - "Incompatible With Secondary Licenses" Notice +--------------------------------------------------------- + + This Source Code Form is "Incompatible With Secondary Licenses", as + defined by the Mozilla Public License, v. 2.0. diff --git a/Makefile b/Makefile new file mode 100644 index 0000000..d474026 --- /dev/null +++ b/Makefile @@ -0,0 +1,23 @@ +all: test + +init: + go get -u golang.org/x/tools/cmd/stringer \ + github.com/dvyukov/go-fuzz/... \ + github.com/golangci/golangci-lint/cmd/golangci-lint + + go test -i -v ./... + +install: + go generate ./... + go install -v ./... + +test: install + go build -v -tags gofuzz ./... + go test -v ./scanner + go test -v ./parser + go test -v ./interpreter + go test -v -covermode=count -coverprofile=cover.out ./... + +check: install + go run misc/check_license.go + golangci-lint run ./... diff --git a/README.md b/README.md new file mode 100644 index 0000000..9d8bfab --- /dev/null +++ b/README.md @@ -0,0 +1,54 @@ +# Gosh + +Gosh is an interpreted language for the Go ecosystem, written in Go. + +It is in a super-early pre-alpha stage. It is also an experiment in community building. +We are looking for brave souls who are interested in contributing and being a subject of this experiment. +For everyone else: please check out this project from time to time to see when it moves to +the alpha/beta stage. + +## Contributing + +See [Contributing Guidelines](CONTRIBUTING.md). + +## License + +Copyright © 2018 Alexey Palazhchenko and contributors. +This Source Code Form is subject to the terms of the Mozilla Public +License, v. 2.0. If a copy of the MPL was not distributed with this +file, You can obtain one at http://mozilla.org/MPL/2.0/. + +Current code is partially based on the source code of the Monkey programming language +from the book [Writing An Interpreter In Go by Thorsten Ball](https://interpreterbook.com). 
+ +``` +Copyright (c) 2016-2017 Thorsten Ball + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+``` + +Current code uses the following dependencies: + +* https://gopkg.in/alecthomas/kingpin.v2 ([MIT License](https://github.com/alecthomas/kingpin/blob/v2.2.6/COPYING)) +* https://github.com/alecthomas/template ([BSD 3-Clause Revised License](https://github.com/alecthomas/template/blob/master/LICENSE)) +* https://github.com/alecthomas/units ([MIT License](https://github.com/alecthomas/units/blob/master/COPYING)) +* https://github.com/davecgh/go-spew ([ISC License](https://github.com/davecgh/go-spew/blob/master/LICENSE)) +* https://github.com/peterh/liner ([X11 License](https://github.com/peterh/liner/blob/master/COPYING)) +* https://github.com/pmezard/go-difflib ([BSD 3-Clause Revised License](https://github.com/pmezard/go-difflib/blob/master/LICENSE)) +* https://github.com/stretchr/testify ([MIT License](https://github.com/stretchr/testify/blob/master/LICENSE)) diff --git a/ast/ast.go b/ast/ast.go new file mode 100644 index 0000000..7bd8db8 --- /dev/null +++ b/ast/ast.go @@ -0,0 +1,41 @@ +// Gosh programming language. +// Copyright (c) 2018 Alexey Palazhchenko and contributors. +// +// This Source Code Form is subject to the terms of the Mozilla Public +// License, v. 2.0. If a copy of the MPL was not distributed with this +// file, You can obtain one at http://mozilla.org/MPL/2.0/. + +// Package ast declares the types used to represent syntax trees for Gosh packages. +package ast + +import ( + "fmt" + "strings" +) + +// Node is a common interface for all AST nodes. +type Node interface { + fmt.Stringer + node() +} + +// Program is the root of the AST. 
+type Program struct { + Statements []Statement +} + +func (p *Program) String() string { + var res strings.Builder + for _, s := range p.Statements { + res.WriteString(s.String()) + res.WriteString(";\n") + } + return res.String() +} + +func (p *Program) node() {} + +// check interfaces +var ( + _ Node = (*Program)(nil) +) diff --git a/ast/expressions.go b/ast/expressions.go new file mode 100644 index 0000000..95ce1e4 --- /dev/null +++ b/ast/expressions.go @@ -0,0 +1,171 @@ +// Gosh programming language. +// Copyright (c) 2018 Alexey Palazhchenko and contributors. +// +// This Source Code Form is subject to the terms of the Mozilla Public +// License, v. 2.0. If a copy of the MPL was not distributed with this +// file, You can obtain one at http://mozilla.org/MPL/2.0/. + +package ast + +import ( + "strings" + + "gosh-lang.org/gosh/tokens" +) + +// Expression is a common interface for all AST expression nodes. +type Expression interface { + Node + expression() +} + +// Identifier represents an identifier expression. +type Identifier struct { + Token tokens.Token // tokens.IDENT + Value string +} + +func (i *Identifier) String() string { + return i.Value +} + +func (i *Identifier) node() {} +func (i *Identifier) expression() {} + +// IntegerLiteral represents an integer literal expression. +type IntegerLiteral struct { + Token tokens.Token // tokens.INT + Value int +} + +func (il *IntegerLiteral) String() string { + return il.Token.Literal +} + +func (il *IntegerLiteral) node() {} +func (il *IntegerLiteral) expression() {} + +// StringLiteral represents a string literal expression. +type StringLiteral struct { + Token tokens.Token // tokens.INT + Value string +} + +func (sl *StringLiteral) String() string { + return sl.Token.Literal +} + +func (sl *StringLiteral) node() {} +func (sl *StringLiteral) expression() {} + +// BooleanLiteral represents a boolean literal expression. 
+type BooleanLiteral struct { + Token tokens.Token // tokens.TRUE or tokens.FALSE + Value bool +} + +func (bl *BooleanLiteral) String() string { + return bl.Token.Literal +} + +func (bl *BooleanLiteral) node() {} +func (bl *BooleanLiteral) expression() {} + +// PrefixExpression represents prefix expression (e.g. `!x`). +type PrefixExpression struct { + Token tokens.Token // tokens.NOT, tokens.SUB + Right Expression +} + +func (pe *PrefixExpression) String() string { + var res strings.Builder + res.WriteString("(") + res.WriteString(pe.Token.Literal) + res.WriteString(pe.Right.String()) + res.WriteString(")") + return res.String() +} + +func (pe *PrefixExpression) node() {} +func (pe *PrefixExpression) expression() {} + +// InfixExpression represents infix expression (e.g. `x + y`). +type InfixExpression struct { + Token tokens.Token // tokens.ADD, tokens.SUB, etc. + Left Expression + Right Expression +} + +func (ie *InfixExpression) String() string { + var res strings.Builder + // res.WriteString("(") + res.WriteString(ie.Left.String()) + res.WriteString(" ") + res.WriteString(ie.Token.Literal) + res.WriteString(" ") + res.WriteString(ie.Right.String()) + // res.WriteString(")") + return res.String() +} + +func (ie *InfixExpression) node() {} +func (ie *InfixExpression) expression() {} + +// FunctionLiteral represents a function literal expression. +type FunctionLiteral struct { + Token tokens.Token + Parameters []*Identifier + Body *BlockStatement +} + +func (fl *FunctionLiteral) String() string { + params := make([]string, len(fl.Parameters)) + for i, p := range fl.Parameters { + params[i] = p.String() + } + + var res strings.Builder + res.WriteString("func(") + res.WriteString(strings.Join(params, ", ")) + res.WriteString(") ") + res.WriteString(fl.Body.String()) + return res.String() +} + +func (fl *FunctionLiteral) node() {} +func (fl *FunctionLiteral) expression() {} + +// CallExpression represents a call expression. 
+type CallExpression struct { + Token tokens.Token + Function Expression + Arguments []Expression +} + +func (ce *CallExpression) String() string { + args := make([]string, len(ce.Arguments)) + for i, a := range ce.Arguments { + args[i] = a.String() + } + + var res strings.Builder + res.WriteString(ce.Function.String()) + res.WriteString("(") + res.WriteString(strings.Join(args, ", ")) + res.WriteString(")") + return res.String() +} + +func (ce *CallExpression) node() {} +func (ce *CallExpression) expression() {} + +// check interfaces +var ( + _ Expression = (*Identifier)(nil) + _ Expression = (*IntegerLiteral)(nil) + _ Expression = (*BooleanLiteral)(nil) + _ Expression = (*PrefixExpression)(nil) + _ Expression = (*InfixExpression)(nil) + _ Expression = (*FunctionLiteral)(nil) + _ Expression = (*CallExpression)(nil) +) diff --git a/ast/statements.go b/ast/statements.go new file mode 100644 index 0000000..b79d52d --- /dev/null +++ b/ast/statements.go @@ -0,0 +1,208 @@ +// Gosh programming language. +// Copyright (c) 2018 Alexey Palazhchenko and contributors. +// +// This Source Code Form is subject to the terms of the Mozilla Public +// License, v. 2.0. If a copy of the MPL was not distributed with this +// file, You can obtain one at http://mozilla.org/MPL/2.0/. + +package ast + +import ( + "strings" + + "gosh-lang.org/gosh/tokens" +) + +// Statement is a common interface for all AST statement nodes. +type Statement interface { + Node + statement() +} + +// IncrementDecrementStatement represents increment or decrement statement (e.g. `x++`, `x--`). 
+type IncrementDecrementStatement struct { + Token tokens.Token // tokens.Increment or tokens.Decrement + Name *Identifier // TODO it can be a more complex expression +} + +func (ids *IncrementDecrementStatement) String() string { + var res strings.Builder + res.WriteString(ids.Name.String()) + res.WriteString(ids.Token.Literal) + return res.String() +} + +func (ids *IncrementDecrementStatement) node() {} +func (ids *IncrementDecrementStatement) statement() {} + +// VarStatement represents a var statement. +type VarStatement struct { + Token tokens.Token // tokens.Var + Name *Identifier + Value Expression +} + +func (vs *VarStatement) String() string { + var res strings.Builder + res.WriteString("var ") + res.WriteString(vs.Name.String()) + if vs.Value != nil { + res.WriteString(" = ") + res.WriteString(vs.Value.String()) + } + return res.String() +} + +func (vs *VarStatement) node() {} +func (vs *VarStatement) statement() {} + +// AssignStatement represents an assign statement. +type AssignStatement struct { + Token tokens.Token // tokens.Assignment or tokens.XXXAssignment + Name *Identifier // TODO it can be a more complex expression + Value Expression +} + +func (as *AssignStatement) String() string { + var res strings.Builder + res.WriteString(as.Name.String()) + res.WriteString(" ") + res.WriteString(as.Token.Literal) + res.WriteString(" ") + res.WriteString(as.Value.String()) + return res.String() +} + +func (as *AssignStatement) node() {} +func (as *AssignStatement) statement() {} + +// ReturnStatement represents a return statement. 
+type ReturnStatement struct { + Token tokens.Token // tokens.Return + Value Expression +} + +func (rs *ReturnStatement) String() string { + var res strings.Builder + res.WriteString("return") + if rs.Value != nil { + res.WriteString(" ") + res.WriteString(rs.Value.String()) + } + return res.String() +} + +func (rs *ReturnStatement) node() {} +func (rs *ReturnStatement) statement() {} + +// ContinueStatement represents a continue statement. +type ContinueStatement struct { + Token tokens.Token // tokens.Continue +} + +func (cs *ContinueStatement) String() string { + return "continue" +} + +func (cs *ContinueStatement) node() {} +func (cs *ContinueStatement) statement() {} + +// IfStatement represents an if/else statement. +type IfStatement struct { + Token tokens.Token // tokens.If + // Init *AssignStatement // initialization statement; or nil // TODO it also can be a define statement + Cond Expression // condition; or nil + Body *BlockStatement + // Else Statement // else branch; or nil +} + +func (is *IfStatement) String() string { + var res strings.Builder + res.WriteString("if (") + res.WriteString(is.Cond.String()) + res.WriteString(") ") + res.WriteString(is.Body.String()) + return res.String() +} + +func (is *IfStatement) node() {} +func (is *IfStatement) statement() {} + +// ForStatement represents a for statement.
+type ForStatement struct { + Token tokens.Token // tokens.For + Init *AssignStatement // initialization statement; or nil // TODO it also can be a define statement + Cond Expression // condition; or nil + Post Statement // post iteration statement; or nil + Body *BlockStatement +} + +func (fs *ForStatement) String() string { + var res strings.Builder + res.WriteString("for ") + if fs.Init != nil { + res.WriteString(fs.Init.String()) + } + res.WriteString("; ") + if fs.Cond != nil { + res.WriteString(fs.Cond.String()) + } + res.WriteString("; ") + if fs.Post != nil { + res.WriteString(fs.Post.String()) + } + res.WriteString(" ") + if fs.Body != nil { + res.WriteString(fs.Body.String()) + } + return res.String() +} + +func (fs *ForStatement) node() {} +func (fs *ForStatement) statement() {} + +// ExpressionStatement represents an expression when it is used as a statement. +type ExpressionStatement struct { + Token tokens.Token // first token of expression + Expression Expression +} + +func (es *ExpressionStatement) String() string { + if es.Expression != nil { + return es.Expression.String() + } + return "" +} + +func (es *ExpressionStatement) node() {} +func (es *ExpressionStatement) statement() {} + +// BlockStatement represents a block statement. 
+type BlockStatement struct { + Token tokens.Token // tokens.LBRACE + Statements []Statement +} + +func (bs *BlockStatement) String() string { + var res strings.Builder + res.WriteString("{\n") + for _, s := range bs.Statements { + res.WriteString(s.String() + ";\n") + } + res.WriteString("}") + return res.String() +} + +func (bs *BlockStatement) node() {} +func (bs *BlockStatement) statement() {} + +// check interfaces +var ( + _ Statement = (*IncrementDecrementStatement)(nil) + _ Statement = (*VarStatement)(nil) + _ Statement = (*AssignStatement)(nil) + _ Statement = (*ReturnStatement)(nil) + _ Statement = (*ForStatement)(nil) + _ Statement = (*ExpressionStatement)(nil) + _ Statement = (*BlockStatement)(nil) +) diff --git a/go.mod b/go.mod new file mode 100644 index 0000000..76a9149 --- /dev/null +++ b/go.mod @@ -0,0 +1,11 @@ +module gosh-lang.org/gosh + +require ( + github.com/alecthomas/template v0.0.0-20160405071501-a0175ee3bccc // indirect + github.com/alecthomas/units v0.0.0-20151022065526-2efee857e7cf // indirect + github.com/davecgh/go-spew v1.1.1 + github.com/peterh/liner v1.1.0 + github.com/pmezard/go-difflib v1.0.0 + github.com/stretchr/testify v1.2.2 + gopkg.in/alecthomas/kingpin.v2 v2.2.6 +) diff --git a/go.sum b/go.sum new file mode 100644 index 0000000..26c6d58 --- /dev/null +++ b/go.sum @@ -0,0 +1,16 @@ +github.com/alecthomas/template v0.0.0-20160405071501-a0175ee3bccc h1:cAKDfWh5VpdgMhJosfJnn5/FoN2SRZ4p7fJNX58YPaU= +github.com/alecthomas/template v0.0.0-20160405071501-a0175ee3bccc/go.mod h1:LOuyumcjzFXgccqObfd/Ljyb9UuFJ6TxHnclSeseNhc= +github.com/alecthomas/units v0.0.0-20151022065526-2efee857e7cf h1:qet1QNfXsQxTZqLG4oE62mJzwPIB8+Tee4RNCL9ulrY= +github.com/alecthomas/units v0.0.0-20151022065526-2efee857e7cf/go.mod h1:ybxpYRFXyAe+OPACYpWeL0wqObRcbAqCMya13uyzqw0= +github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c= +github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= 
+github.com/mattn/go-runewidth v0.0.3 h1:a+kO+98RDGEfo6asOGMmpodZq4FNtnGP54yps8BzLR4= +github.com/mattn/go-runewidth v0.0.3/go.mod h1:LwmH8dsx7+W8Uxz3IHJYH5QSwggIsqBzpuz5H//U1FU= +github.com/peterh/liner v1.1.0 h1:f+aAedNJA6uk7+6rXsYBnhdo4Xux7ESLe+kcuVUF5os= +github.com/peterh/liner v1.1.0/go.mod h1:CRroGNssyjTd/qIG2FyxByd2S8JEAZXBl4qUrZf8GS0= +github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM= +github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4= +github.com/stretchr/testify v1.2.2 h1:bSDNvY7ZPG5RlJ8otE/7V6gMiyenm9RtJ7IUVIAoJ1w= +github.com/stretchr/testify v1.2.2/go.mod h1:a8OnRcib4nhh0OaRAV+Yts87kKdq0PP7pXfy6kDkUVs= +gopkg.in/alecthomas/kingpin.v2 v2.2.6 h1:jMFz6MfLP0/4fUyZle81rXUoxOBFi19VUFKVDOQfozc= +gopkg.in/alecthomas/kingpin.v2 v2.2.6/go.mod h1:FMv+mEhP44yOT+4EoQTLFTRgOQ1FBLkstjWtayDeSgw= diff --git a/internal/gofuzz/corpus.go b/internal/gofuzz/corpus.go new file mode 100644 index 0000000..cac3cba --- /dev/null +++ b/internal/gofuzz/corpus.go @@ -0,0 +1,44 @@ +// Gosh programming language. +// Copyright (c) 2018 Alexey Palazhchenko and contributors. +// +// This Source Code Form is subject to the terms of the Mozilla Public +// License, v. 2.0. If a copy of the MPL was not distributed with this +// file, You can obtain one at http://mozilla.org/MPL/2.0/. + +package gofuzz + +import ( + "crypto/sha1" + "fmt" + "io/ioutil" + "os" + "path/filepath" + "runtime" +) + +var prefix string + +func init() { + _, file, _, ok := runtime.Caller(0) + if !ok { + panic("runtime.Caller(0) failed") + } + prefix = filepath.Join(filepath.Dir(file), "workdir", "corpus") + if err := os.MkdirAll(prefix, 0750); err != nil { + panic(err) + } +} + +// AddFileToCorpus adds named Gosh source code fragment to the go-fuzz corpus. 
+func AddFileToCorpus(name string, data []byte) { + path := filepath.Join(prefix, name) + if err := ioutil.WriteFile(path, data, 0640); err != nil { + panic(err) + } +} + +// AddDataToCorpus adds unnamed Gosh source code fragment to the go-fuzz corpus. +func AddDataToCorpus(prefix string, data []byte) { + name := fmt.Sprintf("%s-%040x.gosh", prefix, sha1.Sum(data)) + AddFileToCorpus(name, data) +} diff --git a/internal/golden/01-fizzbuzz.gosh b/internal/golden/01-fizzbuzz.gosh new file mode 100755 index 0000000..95d28c3 --- /dev/null +++ b/internal/golden/01-fizzbuzz.gosh @@ -0,0 +1,21 @@ +#!/usr/bin/env gosh + +var i = 1 +for i = 1; i <= 100; i++ { + var m3 = (i%3 == 0) + var m5 = (i%5 == 0) + + if (m3 && m5) { + println("FizzBuzz") + continue + } + if (m3) { + println("Fizz") + continue + } + if (m5) { + println("Buzz") + continue + } + println(i) +} diff --git a/internal/golden/01-fizzbuzz.gosh.ast b/internal/golden/01-fizzbuzz.gosh.ast new file mode 100644 index 0000000..9114fa0 --- /dev/null +++ b/internal/golden/01-fizzbuzz.gosh.ast @@ -0,0 +1,446 @@ +(*ast.Program)({ + Statements: ([]ast.Statement) (len=2) { + (*ast.VarStatement)({ + Token: (tokens.Token) { + Offset: (int) 21, + Type: (tokens.Type) (len=3) "VAR", + Literal: (string) (len=3) "var" + }, + Name: (*ast.Identifier)({ + Token: (tokens.Token) { + Offset: (int) 25, + Type: (tokens.Type) (len=10) "IDENTIFIER", + Literal: (string) (len=1) "i" + }, + Value: (string) (len=1) "i" + }), + Value: (*ast.IntegerLiteral)({ + Token: (tokens.Token) { + Offset: (int) 29, + Type: (tokens.Type) (len=7) "INTEGER", + Literal: (string) (len=1) "1" + }, + Value: (int) 1 + }) + }), + (*ast.ForStatement)({ + Token: (tokens.Token) { + Offset: (int) 31, + Type: (tokens.Type) (len=3) "FOR", + Literal: (string) (len=3) "for" + }, + Init: (*ast.AssignStatement)({ + Token: (tokens.Token) { + Offset: (int) 37, + Type: (tokens.Type) (len=10) "ASSIGNMENT", + Literal: (string) (len=1) "=" + }, + Name: (*ast.Identifier)({ + Token: 
(tokens.Token) { + Offset: (int) 35, + Type: (tokens.Type) (len=10) "IDENTIFIER", + Literal: (string) (len=1) "i" + }, + Value: (string) (len=1) "i" + }), + Value: (*ast.IntegerLiteral)({ + Token: (tokens.Token) { + Offset: (int) 39, + Type: (tokens.Type) (len=7) "INTEGER", + Literal: (string) (len=1) "1" + }, + Value: (int) 1 + }) + }), + Cond: (*ast.InfixExpression)({ + Token: (tokens.Token) { + Offset: (int) 44, + Type: (tokens.Type) (len=13) "LESS_OR_EQUAL", + Literal: (string) (len=2) "<=" + }, + Left: (*ast.Identifier)({ + Token: (tokens.Token) { + Offset: (int) 42, + Type: (tokens.Type) (len=10) "IDENTIFIER", + Literal: (string) (len=1) "i" + }, + Value: (string) (len=1) "i" + }), + Right: (*ast.IntegerLiteral)({ + Token: (tokens.Token) { + Offset: (int) 47, + Type: (tokens.Type) (len=7) "INTEGER", + Literal: (string) (len=3) "100" + }, + Value: (int) 100 + }) + }), + Post: (*ast.IncrementDecrementStatement)({ + Token: (tokens.Token) { + Offset: (int) 53, + Type: (tokens.Type) (len=9) "INCREMENT", + Literal: (string) (len=2) "++" + }, + Name: (*ast.Identifier)({ + Token: (tokens.Token) { + Offset: (int) 52, + Type: (tokens.Type) (len=10) "IDENTIFIER", + Literal: (string) (len=1) "i" + }, + Value: (string) (len=1) "i" + }) + }), + Body: (*ast.BlockStatement)({ + Token: (tokens.Token) { + Offset: (int) 56, + Type: (tokens.Type) (len=6) "LBRACE", + Literal: (string) (len=1) "{" + }, + Statements: ([]ast.Statement) (len=6) { + (*ast.VarStatement)({ + Token: (tokens.Token) { + Offset: (int) 59, + Type: (tokens.Type) (len=3) "VAR", + Literal: (string) (len=3) "var" + }, + Name: (*ast.Identifier)({ + Token: (tokens.Token) { + Offset: (int) 63, + Type: (tokens.Type) (len=10) "IDENTIFIER", + Literal: (string) (len=2) "m3" + }, + Value: (string) (len=2) "m3" + }), + Value: (*ast.InfixExpression)({ + Token: (tokens.Token) { + Offset: (int) 73, + Type: (tokens.Type) (len=5) "EQUAL", + Literal: (string) (len=2) "==" + }, + Left: (*ast.InfixExpression)({ + Token: 
(tokens.Token) { + Offset: (int) 70, + Type: (tokens.Type) (len=9) "REMAINDER", + Literal: (string) (len=1) "%" + }, + Left: (*ast.Identifier)({ + Token: (tokens.Token) { + Offset: (int) 69, + Type: (tokens.Type) (len=10) "IDENTIFIER", + Literal: (string) (len=1) "i" + }, + Value: (string) (len=1) "i" + }), + Right: (*ast.IntegerLiteral)({ + Token: (tokens.Token) { + Offset: (int) 71, + Type: (tokens.Type) (len=7) "INTEGER", + Literal: (string) (len=1) "3" + }, + Value: (int) 3 + }) + }), + Right: (*ast.IntegerLiteral)({ + Token: (tokens.Token) { + Offset: (int) 76, + Type: (tokens.Type) (len=7) "INTEGER", + Literal: (string) (len=1) "0" + }, + Value: (int) 0 + }) + }) + }), + (*ast.VarStatement)({ + Token: (tokens.Token) { + Offset: (int) 80, + Type: (tokens.Type) (len=3) "VAR", + Literal: (string) (len=3) "var" + }, + Name: (*ast.Identifier)({ + Token: (tokens.Token) { + Offset: (int) 84, + Type: (tokens.Type) (len=10) "IDENTIFIER", + Literal: (string) (len=2) "m5" + }, + Value: (string) (len=2) "m5" + }), + Value: (*ast.InfixExpression)({ + Token: (tokens.Token) { + Offset: (int) 94, + Type: (tokens.Type) (len=5) "EQUAL", + Literal: (string) (len=2) "==" + }, + Left: (*ast.InfixExpression)({ + Token: (tokens.Token) { + Offset: (int) 91, + Type: (tokens.Type) (len=9) "REMAINDER", + Literal: (string) (len=1) "%" + }, + Left: (*ast.Identifier)({ + Token: (tokens.Token) { + Offset: (int) 90, + Type: (tokens.Type) (len=10) "IDENTIFIER", + Literal: (string) (len=1) "i" + }, + Value: (string) (len=1) "i" + }), + Right: (*ast.IntegerLiteral)({ + Token: (tokens.Token) { + Offset: (int) 92, + Type: (tokens.Type) (len=7) "INTEGER", + Literal: (string) (len=1) "5" + }, + Value: (int) 5 + }) + }), + Right: (*ast.IntegerLiteral)({ + Token: (tokens.Token) { + Offset: (int) 97, + Type: (tokens.Type) (len=7) "INTEGER", + Literal: (string) (len=1) "0" + }, + Value: (int) 0 + }) + }) + }), + (*ast.IfStatement)({ + Token: (tokens.Token) { + Offset: (int) 102, + Type: (tokens.Type) 
(len=2) "IF", + Literal: (string) (len=2) "if" + }, + Cond: (*ast.InfixExpression)({ + Token: (tokens.Token) { + Offset: (int) 109, + Type: (tokens.Type) (len=11) "LOGICAL_AND", + Literal: (string) (len=2) "&&" + }, + Left: (*ast.Identifier)({ + Token: (tokens.Token) { + Offset: (int) 106, + Type: (tokens.Type) (len=10) "IDENTIFIER", + Literal: (string) (len=2) "m3" + }, + Value: (string) (len=2) "m3" + }), + Right: (*ast.Identifier)({ + Token: (tokens.Token) { + Offset: (int) 112, + Type: (tokens.Type) (len=10) "IDENTIFIER", + Literal: (string) (len=2) "m5" + }, + Value: (string) (len=2) "m5" + }) + }), + Body: (*ast.BlockStatement)({ + Token: (tokens.Token) { + Offset: (int) 116, + Type: (tokens.Type) (len=6) "LBRACE", + Literal: (string) (len=1) "{" + }, + Statements: ([]ast.Statement) (len=2) { + (*ast.ExpressionStatement)({ + Token: (tokens.Token) { + Offset: (int) 120, + Type: (tokens.Type) (len=10) "IDENTIFIER", + Literal: (string) (len=7) "println" + }, + Expression: (*ast.CallExpression)({ + Token: (tokens.Token) { + Offset: (int) 127, + Type: (tokens.Type) (len=6) "LPAREN", + Literal: (string) (len=1) "(" + }, + Function: (*ast.Identifier)({ + Token: (tokens.Token) { + Offset: (int) 120, + Type: (tokens.Type) (len=10) "IDENTIFIER", + Literal: (string) (len=7) "println" + }, + Value: (string) (len=7) "println" + }), + Arguments: ([]ast.Expression) (len=1) { + (*ast.StringLiteral)({ + Token: (tokens.Token) { + Offset: (int) 128, + Type: (tokens.Type) (len=6) "STRING", + Literal: (string) (len=10) "\"FizzBuzz\"" + }, + Value: (string) (len=8) "FizzBuzz" + }) + } + }) + }), + (*ast.ContinueStatement)({ + Token: (tokens.Token) { + Offset: (int) 142, + Type: (tokens.Type) (len=8) "CONTINUE", + Literal: (string) (len=8) "continue" + } + }) + } + }) + }), + (*ast.IfStatement)({ + Token: (tokens.Token) { + Offset: (int) 155, + Type: (tokens.Type) (len=2) "IF", + Literal: (string) (len=2) "if" + }, + Cond: (*ast.Identifier)({ + Token: (tokens.Token) { + Offset: 
(int) 159, + Type: (tokens.Type) (len=10) "IDENTIFIER", + Literal: (string) (len=2) "m3" + }, + Value: (string) (len=2) "m3" + }), + Body: (*ast.BlockStatement)({ + Token: (tokens.Token) { + Offset: (int) 163, + Type: (tokens.Type) (len=6) "LBRACE", + Literal: (string) (len=1) "{" + }, + Statements: ([]ast.Statement) (len=2) { + (*ast.ExpressionStatement)({ + Token: (tokens.Token) { + Offset: (int) 167, + Type: (tokens.Type) (len=10) "IDENTIFIER", + Literal: (string) (len=7) "println" + }, + Expression: (*ast.CallExpression)({ + Token: (tokens.Token) { + Offset: (int) 174, + Type: (tokens.Type) (len=6) "LPAREN", + Literal: (string) (len=1) "(" + }, + Function: (*ast.Identifier)({ + Token: (tokens.Token) { + Offset: (int) 167, + Type: (tokens.Type) (len=10) "IDENTIFIER", + Literal: (string) (len=7) "println" + }, + Value: (string) (len=7) "println" + }), + Arguments: ([]ast.Expression) (len=1) { + (*ast.StringLiteral)({ + Token: (tokens.Token) { + Offset: (int) 175, + Type: (tokens.Type) (len=6) "STRING", + Literal: (string) (len=6) "\"Fizz\"" + }, + Value: (string) (len=4) "Fizz" + }) + } + }) + }), + (*ast.ContinueStatement)({ + Token: (tokens.Token) { + Offset: (int) 185, + Type: (tokens.Type) (len=8) "CONTINUE", + Literal: (string) (len=8) "continue" + } + }) + } + }) + }), + (*ast.IfStatement)({ + Token: (tokens.Token) { + Offset: (int) 198, + Type: (tokens.Type) (len=2) "IF", + Literal: (string) (len=2) "if" + }, + Cond: (*ast.Identifier)({ + Token: (tokens.Token) { + Offset: (int) 202, + Type: (tokens.Type) (len=10) "IDENTIFIER", + Literal: (string) (len=2) "m5" + }, + Value: (string) (len=2) "m5" + }), + Body: (*ast.BlockStatement)({ + Token: (tokens.Token) { + Offset: (int) 206, + Type: (tokens.Type) (len=6) "LBRACE", + Literal: (string) (len=1) "{" + }, + Statements: ([]ast.Statement) (len=2) { + (*ast.ExpressionStatement)({ + Token: (tokens.Token) { + Offset: (int) 210, + Type: (tokens.Type) (len=10) "IDENTIFIER", + Literal: (string) (len=7) "println" + 
}, + Expression: (*ast.CallExpression)({ + Token: (tokens.Token) { + Offset: (int) 217, + Type: (tokens.Type) (len=6) "LPAREN", + Literal: (string) (len=1) "(" + }, + Function: (*ast.Identifier)({ + Token: (tokens.Token) { + Offset: (int) 210, + Type: (tokens.Type) (len=10) "IDENTIFIER", + Literal: (string) (len=7) "println" + }, + Value: (string) (len=7) "println" + }), + Arguments: ([]ast.Expression) (len=1) { + (*ast.StringLiteral)({ + Token: (tokens.Token) { + Offset: (int) 218, + Type: (tokens.Type) (len=6) "STRING", + Literal: (string) (len=6) "\"Buzz\"" + }, + Value: (string) (len=4) "Buzz" + }) + } + }) + }), + (*ast.ContinueStatement)({ + Token: (tokens.Token) { + Offset: (int) 228, + Type: (tokens.Type) (len=8) "CONTINUE", + Literal: (string) (len=8) "continue" + } + }) + } + }) + }), + (*ast.ExpressionStatement)({ + Token: (tokens.Token) { + Offset: (int) 241, + Type: (tokens.Type) (len=10) "IDENTIFIER", + Literal: (string) (len=7) "println" + }, + Expression: (*ast.CallExpression)({ + Token: (tokens.Token) { + Offset: (int) 248, + Type: (tokens.Type) (len=6) "LPAREN", + Literal: (string) (len=1) "(" + }, + Function: (*ast.Identifier)({ + Token: (tokens.Token) { + Offset: (int) 241, + Type: (tokens.Type) (len=10) "IDENTIFIER", + Literal: (string) (len=7) "println" + }, + Value: (string) (len=7) "println" + }), + Arguments: ([]ast.Expression) (len=1) { + (*ast.Identifier)({ + Token: (tokens.Token) { + Offset: (int) 249, + Type: (tokens.Type) (len=10) "IDENTIFIER", + Literal: (string) (len=1) "i" + }, + Value: (string) (len=1) "i" + }) + } + }) + }) + } + }) + }) + } +}) diff --git a/internal/golden/01-fizzbuzz.gosh.output b/internal/golden/01-fizzbuzz.gosh.output new file mode 100644 index 0000000..dd96488 --- /dev/null +++ b/internal/golden/01-fizzbuzz.gosh.output @@ -0,0 +1,100 @@ +1 +2 +Fizz +4 +Buzz +Fizz +7 +8 +Fizz +Buzz +11 +Fizz +13 +14 +FizzBuzz +16 +17 +Fizz +19 +Buzz +Fizz +22 +23 +Fizz +Buzz +26 +Fizz +28 +29 +FizzBuzz +31 +32 +Fizz +34 +Buzz 
+Fizz +37 +38 +Fizz +Buzz +41 +Fizz +43 +44 +FizzBuzz +46 +47 +Fizz +49 +Buzz +Fizz +52 +53 +Fizz +Buzz +56 +Fizz +58 +59 +FizzBuzz +61 +62 +Fizz +64 +Buzz +Fizz +67 +68 +Fizz +Buzz +71 +Fizz +73 +74 +FizzBuzz +76 +77 +Fizz +79 +Buzz +Fizz +82 +83 +Fizz +Buzz +86 +Fizz +88 +89 +FizzBuzz +91 +92 +Fizz +94 +Buzz +Fizz +97 +98 +Fizz +Buzz diff --git a/internal/golden/01-fizzbuzz.gosh.text b/internal/golden/01-fizzbuzz.gosh.text new file mode 100644 index 0000000..c5b8898 --- /dev/null +++ b/internal/golden/01-fizzbuzz.gosh.text @@ -0,0 +1,18 @@ +var i = 1; +for i = 1; i <= 100; i++ { +var m3 = i % 3 == 0; +var m5 = i % 5 == 0; +if (m3 && m5) { +println("FizzBuzz"); +continue; +}; +if (m3) { +println("Fizz"); +continue; +}; +if (m5) { +println("Buzz"); +continue; +}; +println(i); +}; diff --git a/internal/golden/01-fizzbuzz.gosh.tokens b/internal/golden/01-fizzbuzz.gosh.tokens new file mode 100644 index 0000000..3e8fee5 --- /dev/null +++ b/internal/golden/01-fizzbuzz.gosh.tokens @@ -0,0 +1,91 @@ +[ 21: VAR var ] +[ 25: IDENTIFIER i ] +[ 27: ASSIGNMENT = ] +[ 29: INTEGER 1 ] +[ 30: SEMICOLON newline ] +[ 31: FOR for ] +[ 35: IDENTIFIER i ] +[ 37: ASSIGNMENT = ] +[ 39: INTEGER 1 ] +[ 40: SEMICOLON ; ] +[ 42: IDENTIFIER i ] +[ 44: LESS_OR_EQUAL <= ] +[ 47: INTEGER 100 ] +[ 50: SEMICOLON ; ] +[ 52: IDENTIFIER i ] +[ 53: INCREMENT ++ ] +[ 56: LBRACE { ] +[ 59: VAR var ] +[ 63: IDENTIFIER m3 ] +[ 66: ASSIGNMENT = ] +[ 68: LPAREN ( ] +[ 69: IDENTIFIER i ] +[ 70: REMAINDER % ] +[ 71: INTEGER 3 ] +[ 73: EQUAL == ] +[ 76: INTEGER 0 ] +[ 77: RPAREN ) ] +[ 78: SEMICOLON newline ] +[ 80: VAR var ] +[ 84: IDENTIFIER m5 ] +[ 87: ASSIGNMENT = ] +[ 89: LPAREN ( ] +[ 90: IDENTIFIER i ] +[ 91: REMAINDER % ] +[ 92: INTEGER 5 ] +[ 94: EQUAL == ] +[ 97: INTEGER 0 ] +[ 98: RPAREN ) ] +[ 99: SEMICOLON newline ] +[ 102: IF if ] +[ 105: LPAREN ( ] +[ 106: IDENTIFIER m3 ] +[ 109: LOGICAL_AND && ] +[ 112: IDENTIFIER m5 ] +[ 114: RPAREN ) ] +[ 116: LBRACE { ] +[ 120: IDENTIFIER println ] +[ 127: 
LPAREN ( ] +[ 128: STRING "FizzBuzz" ] +[ 138: RPAREN ) ] +[ 139: SEMICOLON newline ] +[ 142: CONTINUE continue ] +[ 150: SEMICOLON newline ] +[ 152: RBRACE } ] +[ 153: SEMICOLON newline ] +[ 155: IF if ] +[ 158: LPAREN ( ] +[ 159: IDENTIFIER m3 ] +[ 161: RPAREN ) ] +[ 163: LBRACE { ] +[ 167: IDENTIFIER println ] +[ 174: LPAREN ( ] +[ 175: STRING "Fizz" ] +[ 181: RPAREN ) ] +[ 182: SEMICOLON newline ] +[ 185: CONTINUE continue ] +[ 193: SEMICOLON newline ] +[ 195: RBRACE } ] +[ 196: SEMICOLON newline ] +[ 198: IF if ] +[ 201: LPAREN ( ] +[ 202: IDENTIFIER m5 ] +[ 204: RPAREN ) ] +[ 206: LBRACE { ] +[ 210: IDENTIFIER println ] +[ 217: LPAREN ( ] +[ 218: STRING "Buzz" ] +[ 224: RPAREN ) ] +[ 225: SEMICOLON newline ] +[ 228: CONTINUE continue ] +[ 236: SEMICOLON newline ] +[ 238: RBRACE } ] +[ 239: SEMICOLON newline ] +[ 241: IDENTIFIER println ] +[ 248: LPAREN ( ] +[ 249: IDENTIFIER i ] +[ 250: RPAREN ) ] +[ 251: SEMICOLON newline ] +[ 252: RBRACE } ] +[ 253: SEMICOLON newline ] +[ 254: EOF ] diff --git a/internal/golden/02-fibonacci.gosh.skip b/internal/golden/02-fibonacci.gosh.skip new file mode 100755 index 0000000..6c67732 --- /dev/null +++ b/internal/golden/02-fibonacci.gosh.skip @@ -0,0 +1,12 @@ +func fibonacci(x int) { + switch x { + case 0: + return 0 + case 1: + return 1 + default: + return fibonacci(x-1) + fibonacci(x-2) + } +} + +println(fibonacci(42)) diff --git a/internal/golden/better-fizzbuzz.gosh.skip b/internal/golden/better-fizzbuzz.gosh.skip new file mode 100644 index 0000000..2f0d157 --- /dev/null +++ b/internal/golden/better-fizzbuzz.gosh.skip @@ -0,0 +1,23 @@ +#!/usr/bin/env gosh + +// Write a program that prints the numbers from 1 to 100. +// But for multiples of three print “Fizz” instead of the number and for the multiples of five print “Buzz”. +// For numbers which are multiples of both three and five print “FizzBuzz”. 
+ +var i = 1 // TODO remove +for i = 1; i <= 100; i++ { + // TODO replace with := + var m3 = i%3 == 0 + var m5 = i%5 == 0 + + switch { + case m3 && m5: + println("FizzBuzz") + case m3: + println("Fizz") + case m5: + println("Buzz") + default: + println(i) + } +} diff --git a/internal/golden/golden.go b/internal/golden/golden.go new file mode 100644 index 0000000..ac26632 --- /dev/null +++ b/internal/golden/golden.go @@ -0,0 +1,85 @@ +// Gosh programming language. +// Copyright (c) 2018 Alexey Palazhchenko and contributors. +// +// This Source Code Form is subject to the terms of the Mozilla Public +// License, v. 2.0. If a copy of the MPL was not distributed with this +// file, You can obtain one at http://mozilla.org/MPL/2.0/. + +package golden + +import ( + "fmt" + "io/ioutil" + "path/filepath" + "runtime" + "strings" + + "gosh-lang.org/gosh/internal/gofuzz" +) + +// File describes a single golden file. +type File struct { + File string + Source string + Tokens string + AST []string + Text []string + Output []string +} + +// Data is the golden data shared by all golden tests.
+var Data []File + +func init() { + _, file, _, ok := runtime.Caller(0) + if !ok { + panic("runtime.Caller(0) failed") + } + files, err := filepath.Glob(filepath.Join(filepath.Dir(file), "*.gosh")) + if err != nil { + panic(err) + } + for _, f := range files { + source, err := ioutil.ReadFile(f) //nolint:gas + if err != nil { + panic(err) + } + + gofuzz.AddFileToCorpus(filepath.Base(f), source) + + tokens, err := ioutil.ReadFile(f + ".tokens") + if err != nil { + panic(err) + } + + ast, err := ioutil.ReadFile(f + ".ast") + if err != nil { + panic(err) + } + + text, err := ioutil.ReadFile(f + ".text") + if err != nil { + panic(err) + } + + output, err := ioutil.ReadFile(f + ".output") + if err != nil { + panic(err) + } + + Data = append(Data, File{ + File: filepath.Base(f), + Source: string(source), + Tokens: strings.TrimSpace(string(tokens)), + AST: strings.Split(string(ast), "\n"), + Text: strings.Split(string(text), "\n"), + Output: strings.Split(string(output), "\n"), + }) + } + + // double check + const expected = 1 + if len(Data) != expected { + panic(fmt.Sprintf("expected %d files, read %d", expected, len(Data))) + } +} diff --git a/interpreter/benchmarks_test.go b/interpreter/benchmarks_test.go new file mode 100644 index 0000000..2023777 --- /dev/null +++ b/interpreter/benchmarks_test.go @@ -0,0 +1,62 @@ +// Gosh programming language. +// Copyright (c) 2018 Alexey Palazhchenko and contributors. +// +// This Source Code Form is subject to the terms of the Mozilla Public +// License, v. 2.0. If a copy of the MPL was not distributed with this +// file, You can obtain one at http://mozilla.org/MPL/2.0/. 
+ +package interpreter + +import ( + "context" + "io/ioutil" + "testing" + + "github.com/stretchr/testify/require" + + "gosh-lang.org/gosh/objects" + "gosh-lang.org/gosh/parser" + "gosh-lang.org/gosh/scanner" +) + +var sink interface{} + +func BenchmarkEval(b *testing.B) { + input := ` + var i = 1 + for i = 1; i <= 100; i++ { + var m3 = (i%3 == 0) + var m5 = (i%5 == 0) + + if (m3 && m5) { + println("FizzBuzz") + continue + } + if (m3) { + println("Fizz") + continue + } + if (m5) { + println("Buzz") + continue + } + println(i) + }` + + s, err := scanner.New(input, nil) + require.NoError(b, err) + + p := parser.New(s, nil) + program := p.ParseProgram() + require.Nil(b, p.Errors()) + require.NotNil(b, program) + + i := New(nil) + scope := objects.Builtin(ioutil.Discard) + + b.ReportAllocs() + b.ResetTimer() + for n := 0; n < b.N; n++ { + sink = i.Eval(context.Background(), program, scope) + } +} diff --git a/interpreter/interpreter.go b/interpreter/interpreter.go new file mode 100644 index 0000000..b7876d5 --- /dev/null +++ b/interpreter/interpreter.go @@ -0,0 +1,335 @@ +// Gosh programming language. +// Copyright (c) 2018 Alexey Palazhchenko and contributors. +// +// This Source Code Form is subject to the terms of the Mozilla Public +// License, v. 2.0. If a copy of the MPL was not distributed with this +// file, You can obtain one at http://mozilla.org/MPL/2.0/. + +package interpreter + +import ( + "context" + "fmt" + + "gosh-lang.org/gosh/ast" + "gosh-lang.org/gosh/objects" + "gosh-lang.org/gosh/tokens" +) + +// Interpreter evaluates Gosh AST nodes. +type Interpreter struct { + config *Config +} + +// Config configures interpreter. +type Config struct { +} + +// New creates a new interpreter. +func New(config *Config) *Interpreter { + if config == nil { + config = new(Config) + } + + return &Interpreter{ + config: config, + } +} + +func (i *Interpreter) crash(format string, a ...interface{}) { + msg := fmt.Sprintf(format, a...) 
+ panic(msg) +} + +// Eval evaluates given node in the given scope. +func (i *Interpreter) Eval(ctx context.Context, node ast.Node, scope *objects.Scope) objects.Object { + if ctx.Err() != nil { + // FIXME return error + return nil + } + + switch node := node.(type) { + case *ast.Program: + var res objects.Object + for _, s := range node.Statements { + res = i.Eval(ctx, s, scope) + } + return res + + case *ast.BlockStatement: + var res objects.Object + for _, s := range node.Statements { + res = i.Eval(ctx, s, scope) + if res != nil { + switch res.Type() { + case objects.CONTINUE: + return res + } + } + } + return res + + case *ast.ExpressionStatement: + return i.Eval(ctx, node.Expression, scope) + + case *ast.ReturnStatement: + return i.Eval(ctx, node.Value, scope) + + case *ast.VarStatement: + val := i.Eval(ctx, node.Value, scope) + scope.Set(node.Name.Value, val) + return nil + + case *ast.AssignStatement: + return i.evalAssignStatement(ctx, node, scope) + + case *ast.ForStatement: + return i.evalForStatement(ctx, node, scope) + + case *ast.IfStatement: + return i.evalIfStatement(ctx, node, scope) + + case *ast.IncrementDecrementStatement: + return i.evalIncrementDecrementStatement(node, scope) + + case *ast.ContinueStatement: + return &objects.Continue{} + + case *ast.Identifier: + val, ok := scope.Lookup(node.Value) + if !ok { + i.crash("identifier not found: %s", node.Value) + } + return val + + case *ast.PrefixExpression: + right := i.Eval(ctx, node.Right, scope) + return i.evalPrefixExpression(node.Token.Literal, right) + + case *ast.InfixExpression: + left := i.Eval(ctx, node.Left, scope) + right := i.Eval(ctx, node.Right, scope) + return i.evalInfixExpression(node.Token.Literal, left, right) + + case *ast.IntegerLiteral: + return &objects.Integer{Value: node.Value} + + case *ast.BooleanLiteral: + return &objects.Boolean{Value: node.Value} + + case *ast.StringLiteral: + return &objects.String{Value: node.Value} + + case *ast.FunctionLiteral: + return 
&objects.Function{ + Parameters: node.Parameters, + Body: node.Body, + Scope: scope, + } + + case *ast.CallExpression: + return i.evalCallExpression(ctx, node, scope) + + default: + i.crash("unexpected node %T:\n%#v", node, node) + panic("not reached") + } +} + +func (i *Interpreter) evalPrefixExpression(operator string, right objects.Object) objects.Object { + switch operator { + case "!": + if b, ok := right.(*objects.Boolean); ok { + return &objects.Boolean{Value: !b.Value} + } + i.crash("prefix expression operator ! on %T:\n%#v", right, right) + + case "-": + if i, ok := right.(*objects.Integer); ok { + return &objects.Integer{Value: -i.Value} + } + i.crash("prefix expression operator - on %T:\n%#v", right, right) + + default: + i.crash("unhandled prefix expression operator %s", operator) + } + panic("not reached") +} + +func (i *Interpreter) evalInfixIntegerExpression(operator string, left, right int) objects.Object { + switch operator { + case "+": + return &objects.Integer{Value: left + right} + case "-": + return &objects.Integer{Value: left - right} + case "*": + return &objects.Integer{Value: left * right} + case "/": + return &objects.Integer{Value: left / right} + case "%": + return &objects.Integer{Value: left % right} + + case "<": + return &objects.Boolean{Value: left < right} + case "<=": + return &objects.Boolean{Value: left <= right} + case ">": + return &objects.Boolean{Value: left > right} + case ">=": + return &objects.Boolean{Value: left >= right} + case "==": + return &objects.Boolean{Value: left == right} + case "!=": + return &objects.Boolean{Value: left != right} + + default: + i.crash("unhandled infix expression operator %s for two Integers", operator) + panic("not reached") + } +} + +func (i *Interpreter) evalInfixBooleanExpression(operator string, left, right bool) objects.Object { + switch operator { + case "==": + return &objects.Boolean{Value: left == right} + case "!=": + return &objects.Boolean{Value: left != right} + case "&&": + 
return &objects.Boolean{Value: left && right} + case "||": + return &objects.Boolean{Value: left || right} + default: + i.crash("unhandled infix expression operator %s for two Booleans", operator) + panic("not reached") + } +} + +func (i *Interpreter) evalInfixExpression(operator string, left, right objects.Object) objects.Object { + switch left.Type() { + case objects.INTEGER: + switch right.Type() { + case objects.INTEGER: + l := left.(*objects.Integer).Value + r := right.(*objects.Integer).Value + return i.evalInfixIntegerExpression(operator, l, r) + } + + case objects.BOOLEAN: + switch right.Type() { + case objects.BOOLEAN: + l := left.(*objects.Boolean).Value + r := right.(*objects.Boolean).Value + return i.evalInfixBooleanExpression(operator, l, r) + } + } + + i.crash("unhandled combination: %T %s %T", left, operator, right) + panic("not reached") +} + +func (i *Interpreter) evalExpressions(ctx context.Context, exps []ast.Expression, scope *objects.Scope) []objects.Object { + res := make([]objects.Object, len(exps)) + for n, e := range exps { + res[n] = i.Eval(ctx, e, scope) + } + return res +} + +func (i *Interpreter) evalAssignStatement(ctx context.Context, node *ast.AssignStatement, scope *objects.Scope) objects.Object { + val := i.Eval(ctx, node.Value, scope) + switch node.Token.Type { + case tokens.Assignment: + // nothing + default: + i.crash("unhandled token %s", node.Token) + } + scope.Set(node.Name.Value, val) + return nil +} + +func (i *Interpreter) evalForStatement(ctx context.Context, node *ast.ForStatement, scope *objects.Scope) objects.Object { + i.Eval(ctx, node.Init, scope) + for { + cond := i.Eval(ctx, node.Cond, scope) + var b *objects.Boolean + var ok bool + if b, ok = cond.(*objects.Boolean); !ok { + i.crash("expected boolean, got %T %s", cond, cond) + } + if !b.Value { + return nil + } + + body := i.Eval(ctx, node.Body, scope) + + i.Eval(ctx, node.Post, scope) + + if body != nil { + switch body.Type() { + case objects.CONTINUE: + continue 
+ } + } + } +} + +func (i *Interpreter) evalIfStatement(ctx context.Context, node *ast.IfStatement, scope *objects.Scope) objects.Object { + cond := i.Eval(ctx, node.Cond, scope) + var b *objects.Boolean + var ok bool + if b, ok = cond.(*objects.Boolean); !ok { + i.crash("expected boolean, got %T %s", cond, cond) + } + if !b.Value { + return nil + } + + body := i.Eval(ctx, node.Body, scope) + if body != nil { + switch body.Type() { + case objects.CONTINUE: + return body + } + } + return nil +} + +func (i *Interpreter) evalIncrementDecrementStatement(node *ast.IncrementDecrementStatement, scope *objects.Scope) objects.Object { + name := node.Name.Value + val, ok := scope.Lookup(name) + if !ok { + i.crash("failed to lookup %s", name) + } + + v := val.(*objects.Integer).Value + + switch node.Token.Type { + case tokens.Increment: + v++ + case tokens.Decrement: + v-- + default: + i.crash("unexpected token") + } + + scope.Set(name, &objects.Integer{Value: v}) + return nil +} + +func (i *Interpreter) evalCallExpression(ctx context.Context, node *ast.CallExpression, scope *objects.Scope) objects.Object { + f := i.Eval(ctx, node.Function, scope) + args := i.evalExpressions(ctx, node.Arguments, scope) + switch f := f.(type) { + case *objects.Function: + newScope := objects.NewScope(scope) + for i, name := range f.Parameters { + newScope.Set(name.Value, args[i]) + } + return i.Eval(ctx, f.Body, newScope) + case *objects.GoFunction: + return f.Func(args...) + default: + i.crash("unexpected node %T:\n%#v", node, node) + panic("not reached") + } +} diff --git a/interpreter/interpreter_fuzz.go b/interpreter/interpreter_fuzz.go new file mode 100644 index 0000000..d80157a --- /dev/null +++ b/interpreter/interpreter_fuzz.go @@ -0,0 +1,34 @@ +// Gosh programming language. +// Copyright (c) 2018 Alexey Palazhchenko and contributors. +// +// This Source Code Form is subject to the terms of the Mozilla Public +// License, v. 2.0. 
If a copy of the MPL was not distributed with this +// file, You can obtain one at http://mozilla.org/MPL/2.0/. + +// +build gofuzz + +package interpreter + +import ( + "context" + "io/ioutil" + + "gosh-lang.org/gosh/objects" + "gosh-lang.org/gosh/parser" + "gosh-lang.org/gosh/scanner" +) + +func Fuzz(data []byte) int { + s, err := scanner.New(string(data), &scanner.Config{ + SkipShebang: true, + }) + if err != nil { + return 0 + } + + p := parser.New(s, nil) + program := p.ParseProgram() + i := New(nil) + i.Eval(context.TODO(), program, objects.NewScope(objects.Builtin(ioutil.Discard))) + return 0 +} diff --git a/interpreter/interpreter_test.go b/interpreter/interpreter_test.go new file mode 100644 index 0000000..fc6204d --- /dev/null +++ b/interpreter/interpreter_test.go @@ -0,0 +1,127 @@ +// Gosh programming language. +// Copyright (c) 2018 Alexey Palazhchenko and contributors. +// +// This Source Code Form is subject to the terms of the Mozilla Public +// License, v. 2.0. If a copy of the MPL was not distributed with this +// file, You can obtain one at http://mozilla.org/MPL/2.0/. 
+ +package interpreter + +import ( + "bytes" + "context" + "strings" + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" + + "gosh-lang.org/gosh/internal/gofuzz" + "gosh-lang.org/gosh/internal/golden" + "gosh-lang.org/gosh/objects" + "gosh-lang.org/gosh/parser" + "gosh-lang.org/gosh/scanner" +) + +func TestGolden(t *testing.T) { + for _, f := range golden.Data { + t.Run(f.File, func(t *testing.T) { + s, err := scanner.New(f.Source, &scanner.Config{ + SkipShebang: true, + }) + require.NoError(t, err) + p := parser.New(s, nil) + program := p.ParseProgram() + var buf bytes.Buffer + i := New(nil) + res := i.Eval(context.Background(), program, objects.NewScope(objects.Builtin(&buf))) + t.Log(res) + assert.Equal(t, f.Output, strings.Split(buf.String(), "\n")) + }) + } +} + +func eval(t *testing.T, input string) (objects.Object, *bytes.Buffer) { + t.Helper() + + s, err := scanner.New(input, nil) + require.NoError(t, err) + + p := parser.New(s, nil) + program := p.ParseProgram() + require.Nil(t, p.Errors(), "%s", p.Errors()) + require.NotNil(t, program) + + i := New(nil) + var buf bytes.Buffer + res := i.Eval(context.Background(), program, objects.NewScope(objects.Builtin(&buf))) + return res, &buf +} + +func TestInfixExpression(t *testing.T) { + for input, expected := range map[string]bool{ + `7 < 42`: true, + `7 <= 42`: true, + `7 > 42`: false, + `7 >= 42`: false, + `7 == 42`: false, + `7 != 42`: true, + } { + t.Run(input, func(t *testing.T) { + gofuzz.AddDataToCorpus("interpreter", []byte(input)) + + actual, res := eval(t, input) + require.IsType(t, &objects.Boolean{}, actual) + assert.Equal(t, expected, actual.(*objects.Boolean).Value) + assert.Empty(t, res.String()) + }) + } + + for input, expected := range map[string]int{ + `42 + 7`: 49, + `42 - 7`: 35, + `42 * 7`: 294, + `42 / 7`: 6, + `42 % 7`: 0, + } { + t.Run(input, func(t *testing.T) { + gofuzz.AddDataToCorpus("interpreter", []byte(input)) + + actual, res := eval(t, 
input) + require.IsType(t, &objects.Integer{}, actual) + assert.Equal(t, expected, actual.(*objects.Integer).Value) + assert.Empty(t, res.String()) + }) + } +} + +func TestLen(t *testing.T) { + for input, expected := range map[string]int{ + `len("FizzBuzz")`: 8, + } { + t.Run(input, func(t *testing.T) { + gofuzz.AddDataToCorpus("interpreter", []byte(input)) + + actual, res := eval(t, input) + require.IsType(t, &objects.Integer{}, actual) + assert.Equal(t, expected, actual.(*objects.Integer).Value) + assert.Empty(t, res.String()) + }) + } +} + +func TestIf(t *testing.T) { + for input, output := range map[string]string{ + `if (true) { print(true) }`: "true", + `if (false && true) { print(true) }`: "", + `if (true && false) { print(true) }`: "", + } { + t.Run(input, func(t *testing.T) { + gofuzz.AddDataToCorpus("interpreter", []byte(input)) + + res, buf := eval(t, input) + assert.Nil(t, res) + assert.Equal(t, output, buf.String()) + }) + } +} diff --git a/main.go b/main.go new file mode 100644 index 0000000..4b96bd4 --- /dev/null +++ b/main.go @@ -0,0 +1,195 @@ +// Gosh programming language. +// Copyright (c) 2018 Alexey Palazhchenko and contributors. +// +// This Source Code Form is subject to the terms of the Mozilla Public +// License, v. 2.0. If a copy of the MPL was not distributed with this +// file, You can obtain one at http://mozilla.org/MPL/2.0/. + +package main + +import ( + "context" + "fmt" + "io" + "io/ioutil" + "log" + "os" + "os/exec" + "os/user" + "path/filepath" + "regexp" + "runtime" + + "github.com/davecgh/go-spew/spew" + "github.com/peterh/liner" + "gopkg.in/alecthomas/kingpin.v2" + + "gosh-lang.org/gosh/interpreter" + "gosh-lang.org/gosh/objects" + "gosh-lang.org/gosh/parser" + "gosh-lang.org/gosh/scanner" + "gosh-lang.org/gosh/tokens" +) + +var ( + // Version of the interpreter. 
+ Version = "0.0.1-dev" +) + +// flags +var ( + DebugScannerF *bool + DebugASTF *bool + DebugParserF *bool +) + +var versionRE = regexp.MustCompile(`go(\S+)`) + +func extractGoVersion(s string) string { + res := versionRE.FindStringSubmatch(s) + if len(res) == 2 { + return res[1] + } + return "" +} + +func goVersion() string { + cmd := exec.Command("go", "version") //nolint:gas + b, err := cmd.CombinedOutput() + if err != nil { + return "" + } + return string(b) +} + +func eval(line string, scope *objects.Scope) { + s, err := scanner.New(line, &scanner.Config{ + SkipShebang: true, + }) + if err != nil { + log.Printf("Scanner error: %s.", err) + return + } + if *DebugScannerF { + log.Print("Tokens:") + for { + t := s.NextToken() + log.Print(t) + switch t.Type { + case tokens.EOF, tokens.Illegal: + return + } + } + } + + p := parser.New(s, nil) + program := p.ParseProgram() + if len(p.Errors()) != 0 { + log.Print("Parser errors:\n") + for _, e := range p.Errors() { + log.Printf("\t%s", e) + } + return + } + if *DebugASTF { + cfg := &spew.ConfigState{ + Indent: " ", + DisableMethods: true, + DisablePointerMethods: true, + DisablePointerAddresses: true, + DisableCapacities: true, + ContinueOnMethod: true, + } + b := cfg.Sdump(program) + log.Printf("AST:\n%s", b) + return + } + if *DebugParserF { + log.Printf("Parsed program:\n%s", program.String()) + return + } + + i := interpreter.New(nil) + res := i.Eval(context.TODO(), program, scope) + if res != nil { + fmt.Println(res.String()) + } +} + +func runREPL() { + fmt.Printf("Gosh v%s. 
https://gosh-lang.org/\n", Version) + fmt.Printf("Built with Go v%s.\n", extractGoVersion(runtime.Version())) + fmt.Printf("Runtime Go v%s.\n", extractGoVersion(goVersion())) + + liner := liner.NewLiner() + + var historyFilename string + u, err := user.Current() + if err == nil && u.HomeDir != "" { + historyFilename = filepath.Join(u.HomeDir, ".gosh_history") + } + if historyFilename != "" { + f, err := os.Open(historyFilename) + switch { + case err == nil: + liner.ReadHistory(f) + f.Close() + case os.IsNotExist(err): + // nothing + default: + log.Printf("Warning: failed to read history file %s.", historyFilename) + } + } + + defer func() { + fmt.Println() + + os.Create(historyFilename) + + if err := liner.Close(); err != nil { + log.Print(err) + } + }() + + scope := objects.NewScope(objects.Builtin(os.Stdout)) + for { + line, err := liner.Prompt(`\ʕ•ϖ•ʔ/ >> `) + switch err { + case nil: + liner.AppendHistory(line) + eval(line, scope) + case io.EOF: + return + default: + log.Fatal(err) + } + } +} + +func evalFile(filename string) { + b, err := ioutil.ReadFile(filename) + if err != nil { + log.Fatal(err) + } + + scope := objects.NewScope(objects.Builtin(os.Stdout)) + eval(string(b), scope) +} + +func main() { + log.SetFlags(0) + + DebugScannerF = kingpin.Flag("debug-scanner", "Print tokens and exit.").Bool() + DebugASTF = kingpin.Flag("debug-ast", "Print AST and exit.").Bool() + DebugParserF = kingpin.Flag("debug-parser", "Print parsed program and exit.").Bool() + fileArg := kingpin.Arg("file", "Gosh program file.").String() + kingpin.CommandLine.HelpFlag.Short('h') + kingpin.Parse() + + switch *fileArg { + case "": + runREPL() + default: + evalFile(*fileArg) + } +} diff --git a/misc/check_license.go b/misc/check_license.go new file mode 100644 index 0000000..b960240 --- /dev/null +++ b/misc/check_license.go @@ -0,0 +1,114 @@ +// Gosh programming language. +// Copyright (c) 2018 Alexey Palazhchenko and contributors. 
+// +// This Source Code Form is subject to the terms of the Mozilla Public +// License, v. 2.0. If a copy of the MPL was not distributed with this +// file, You can obtain one at http://mozilla.org/MPL/2.0/. + +// +build ignore + +// check_license checks that MPL license header in all files matches header in this file. +package main + +import ( + "bufio" + "flag" + "fmt" + "io" + "log" + "os" + "path/filepath" + "regexp" + "runtime" +) + +func getHeader() string { + _, file, _, ok := runtime.Caller(0) + if !ok { + panic("runtime.Caller(0) failed") + } + f, err := os.Open(file) + if err != nil { + log.Fatal(err) + } + defer f.Close() + + var header string + s := bufio.NewScanner(f) + for s.Scan() { + if s.Text() == "" { + break + } + header += s.Text() + "\n" + } + header += "\n" + if err := s.Err(); err != nil { + log.Fatal(err) + } + return header +} + +var generatedHeader = regexp.MustCompile(`^// Code generated .* DO NOT EDIT\.`) + +func checkHeader(path string, header string) bool { + f, err := os.Open(path) + if err != nil { + log.Fatal(err) + } + defer f.Close() + + actual := make([]byte, len(header)) + if _, err = io.ReadFull(f, actual); err != nil { + log.Printf("%s - %s", path, err) + return false + } + + if generatedHeader.Match(actual) { + return true + } + + if header != string(actual) { + log.Print(path) + return false + } + return true +} + +func main() { + log.SetFlags(0) + flag.Usage = func() { + fmt.Fprintln(flag.CommandLine.Output(), "Usage: go run misc/check_license.go") + flag.CommandLine.PrintDefaults() + } + flag.Parse() + + header := getHeader() + + ok := true + filepath.Walk(".", func(path string, info os.FileInfo, err error) error { + if err != nil { + return err + } + if info.IsDir() { + switch info.Name() { + case ".git", "vendor": + return filepath.SkipDir + default: + return nil + } + } + + if filepath.Ext(info.Name()) == ".go" { + if !checkHeader(path, header) { + ok = false + } + } + return nil + }) + + if ok { + os.Exit(0) + } + 
log.Print("Please update license header in those files.") + os.Exit(1) +} diff --git a/misc/run_fuzzer.go b/misc/run_fuzzer.go new file mode 100644 index 0000000..1212a87 --- /dev/null +++ b/misc/run_fuzzer.go @@ -0,0 +1,69 @@ +// Gosh programming language. +// Copyright (c) 2018 Alexey Palazhchenko and contributors. +// +// This Source Code Form is subject to the terms of the Mozilla Public +// License, v. 2.0. If a copy of the MPL was not distributed with this +// file, You can obtain one at http://mozilla.org/MPL/2.0/. + +// +build ignore + +// run_fuzzer runs go-fuzz for the given package. +package main + +import ( + "flag" + "fmt" + "log" + "os" + "os/exec" + "path/filepath" + "strings" +) + +func run(args ...string) { + cmd := exec.Command(args[0], args[1:]...) + cmd.Stdout = os.Stdout + cmd.Stderr = os.Stderr + cmd.Env = append(os.Environ(), "GO111MODULE=off") + log.Print(strings.Join(cmd.Args, " ")) + if err := cmd.Run(); err != nil { + log.Fatal(err) + } +} + +func main() { + log.SetFlags(0) + packages := []string{"scanner", "parser", "interpreter"} + flag.Usage = func() { + fmt.Fprintln(flag.CommandLine.Output(), "Usage: go run misc/run_fuzzer.go [package]") + fmt.Fprintln(flag.CommandLine.Output(), " [package] is one of: "+strings.Join(packages, ", ")+".") + flag.CommandLine.PrintDefaults() + } + flag.Parse() + + if flag.NArg() != 1 { + flag.Usage() + os.Exit(1) + } + + pack := flag.Arg(0) + var found bool + for _, p := range packages { + if p == pack { + found = true + break + } + } + if !found { + flag.Usage() + os.Exit(1) + } + + importPath := "gosh-lang.org/gosh/" + pack + file := filepath.Join("internal", "gofuzz", "workdir", pack+"_fuzz.zip") + workdir := filepath.Join("internal", "gofuzz", "workdir") + + run("go", "install", "-v", "-tags", "gofuzz", "./...") + run("go-fuzz-build", "-o="+file, importPath) + run("go-fuzz", "-bin="+file, "-workdir="+workdir) +} diff --git a/objects/builtin.go b/objects/builtin.go new file mode 100644 index 
0000000..942ee84 --- /dev/null +++ b/objects/builtin.go @@ -0,0 +1,74 @@ +// Gosh programming language. +// Copyright (c) 2018 Alexey Palazhchenko and contributors. +// +// This Source Code Form is subject to the terms of the Mozilla Public +// License, v. 2.0. If a copy of the MPL was not distributed with this +// file, You can obtain one at http://mozilla.org/MPL/2.0/. + +package objects + +import ( + "fmt" + "io" + "strings" +) + +var ( + lenBuiltin = &GoFunction{Func: func(args ...Object) Object { + if len(args) != 1 { + panic(fmt.Errorf("len: expected 1 argument, got %d", len(args))) + } + arg := args[0] + switch arg := arg.(type) { + case *String: + return &Integer{ + Value: len(arg.Value), + } + default: + panic(fmt.Errorf("len: unexpected argument type %T", arg)) + } + }} + + // TODO append + // TODO cap + // TODO close + // TODO copy + // TODO delete + // TODO make + // TODO new + // TODO panic + // TODO recover +) + +func makePrintBuiltin(stdout io.Writer) *GoFunction { + return &GoFunction{Func: func(args ...Object) Object { + res := make([]string, len(args)) + for i, arg := range args { + res[i] = arg.String() + } + fmt.Fprint(stdout, strings.Join(res, " ")) + return nil + }} +} + +func makePrintlnBuiltin(stdout io.Writer) *GoFunction { + return &GoFunction{Func: func(args ...Object) Object { + res := make([]string, len(args)) + for i, arg := range args { + res[i] = arg.String() + } + fmt.Fprintln(stdout, strings.Join(res, " ")) + return nil + }} +} + +// Builtin returns a Scope of predeclared identifiers. +func Builtin(stdout io.Writer) *Scope { + return &Scope{ + store: map[string]Object{ + "print": makePrintBuiltin(stdout), + "println": makePrintlnBuiltin(stdout), + "len": lenBuiltin, + }, + } +} diff --git a/objects/objects.go b/objects/objects.go new file mode 100644 index 0000000..a39a230 --- /dev/null +++ b/objects/objects.go @@ -0,0 +1,106 @@ +// Gosh programming language. +// Copyright (c) 2018 Alexey Palazhchenko and contributors. 
+// +// This Source Code Form is subject to the terms of the Mozilla Public +// License, v. 2.0. If a copy of the MPL was not distributed with this +// file, You can obtain one at http://mozilla.org/MPL/2.0/. + +package objects + +import ( + "fmt" + "reflect" + "strconv" + "strings" + + "gosh-lang.org/gosh/ast" +) + +// Object is a common interface for all Gosh runtime objects. +type Object interface { + // Type returns object's type. + Type() Type + fmt.Stringer +} + +// Integer represents integer runtime object. +type Integer struct { + Value int +} + +// Type returns INTEGER. +func (i *Integer) Type() Type { return INTEGER } + +func (i *Integer) String() string { return strconv.Itoa(i.Value) } + +// Boolean represents boolean runtime object. +type Boolean struct { + Value bool +} + +// Type returns BOOLEAN. +func (b *Boolean) Type() Type { return BOOLEAN } + +func (b *Boolean) String() string { return strconv.FormatBool(b.Value) } + +// String represents string runtime object. +type String struct { + Value string +} + +// Type returns STRING. +func (s *String) Type() Type { return STRING } + +func (s *String) String() string { return s.Value } + +// Continue represents continue runtime object. +type Continue struct{} + +// Type returns CONTINUE. +func (c *Continue) Type() Type { return CONTINUE } + +func (c *Continue) String() string { + return "continue" +} + +// Function represents function runtime object. +type Function struct { + Parameters []*ast.Identifier + Body *ast.BlockStatement + Scope *Scope +} + +// Type returns FUNCTION. +func (f *Function) Type() Type { return FUNCTION } + +func (f *Function) String() string { + params := make([]string, len(f.Parameters)) + for i, p := range f.Parameters { + params[i] = p.String() + } + + var res strings.Builder + res.WriteString("func(") + res.WriteString(strings.Join(params, ", ")) + res.WriteString(") ") + res.WriteString(f.Body.String()) + return res.String() +} + +// GoFunction represents Go function. 
+type GoFunction struct { + Func func(args ...Object) Object +} + +// Type returns GOFUNCTION. +func (gf *GoFunction) Type() Type { return GOFUNCTION } + +func (gf *GoFunction) String() string { return reflect.ValueOf(gf.Func).String() } + +// check interfaces +var ( + _ Object = (*Integer)(nil) + _ Object = (*Boolean)(nil) + _ Object = (*Function)(nil) + _ Object = (*GoFunction)(nil) +) diff --git a/objects/scope.go b/objects/scope.go new file mode 100644 index 0000000..5cae582 --- /dev/null +++ b/objects/scope.go @@ -0,0 +1,37 @@ +// Gosh programming language. +// Copyright (c) 2018 Alexey Palazhchenko and contributors. +// +// This Source Code Form is subject to the terms of the Mozilla Public +// License, v. 2.0. If a copy of the MPL was not distributed with this +// file, You can obtain one at http://mozilla.org/MPL/2.0/. + +package objects + +// A Scope maintains the set of named language entities declared in the scope +// and a link to the immediately surrounding (outer) scope. +type Scope struct { + outer *Scope + store map[string]Object +} + +// NewScope creates a new scope nested in the outer scope. +func NewScope(outer *Scope) *Scope { + return &Scope{ + outer: outer, + store: make(map[string]Object), + } +} + +// Lookup returns a named entity from this or an outer scope (recursively). +func (e *Scope) Lookup(name string) (Object, bool) { + obj, ok := e.store[name] + if !ok && e.outer != nil { + obj, ok = e.outer.Lookup(name) + } + return obj, ok +} + +// Set adds or replaces a named entity in scope. +func (e *Scope) Set(name string, obj Object) { + e.store[name] = obj +} diff --git a/objects/type.go b/objects/type.go new file mode 100644 index 0000000..97c10a2 --- /dev/null +++ b/objects/type.go @@ -0,0 +1,23 @@ +// Gosh programming language. +// Copyright (c) 2018 Alexey Palazhchenko and contributors. +// +// This Source Code Form is subject to the terms of the Mozilla Public +// License, v. 2.0. 
If a copy of the MPL was not distributed with this +// file, You can obtain one at http://mozilla.org/MPL/2.0/. + +package objects + +//go:generate stringer -type Type + +// Type is the set of object types of the Gosh programming language. +type Type int + +// The list of object types. +const ( + INTEGER Type = iota + BOOLEAN + STRING + FUNCTION + GOFUNCTION + CONTINUE +) diff --git a/objects/type_string.go b/objects/type_string.go new file mode 100644 index 0000000..8e2378a --- /dev/null +++ b/objects/type_string.go @@ -0,0 +1,16 @@ +// Code generated by "stringer -type Type"; DO NOT EDIT. + +package objects + +import "strconv" + +const _Type_name = "INTEGERBOOLEANSTRINGFUNCTIONGOFUNCTIONCONTINUE" + +var _Type_index = [...]uint8{0, 7, 14, 20, 28, 38, 46} + +func (i Type) String() string { + if i < 0 || i >= Type(len(_Type_index)-1) { + return "Type(" + strconv.FormatInt(int64(i), 10) + ")" + } + return _Type_name[_Type_index[i]:_Type_index[i+1]] +} diff --git a/parser/error.go b/parser/error.go new file mode 100644 index 0000000..fe4add4 --- /dev/null +++ b/parser/error.go @@ -0,0 +1,22 @@ +// Gosh programming language. +// Copyright (c) 2018 Alexey Palazhchenko and contributors. +// +// This Source Code Form is subject to the terms of the Mozilla Public +// License, v. 2.0. If a copy of the MPL was not distributed with this +// file, You can obtain one at http://mozilla.org/MPL/2.0/. + +package parser + +// Error is a parser error. +type Error struct { + Err string +} + +func (e *Error) Error() string { + return e.Err +} + +// check interfaces +var ( + _ error = (*Error)(nil) +) diff --git a/parser/parser.go b/parser/parser.go new file mode 100644 index 0000000..7b49340 --- /dev/null +++ b/parser/parser.go @@ -0,0 +1,695 @@ +// Gosh programming language. +// Copyright (c) 2018 Alexey Palazhchenko and contributors. +// +// This Source Code Form is subject to the terms of the Mozilla Public +// License, v. 2.0. 
If a copy of the MPL was not distributed with this +// file, You can obtain one at http://mozilla.org/MPL/2.0/. + +// Package parser implements a parser for Gosh source files. +package parser + +import ( + "fmt" + "strconv" + "strings" + + "gosh-lang.org/gosh/ast" + "gosh-lang.org/gosh/scanner" + "gosh-lang.org/gosh/tokens" +) + +// Parser implements parsing of Gosh source files. +type Parser struct { + s *scanner.Scanner + config *Config + errors []error + + curToken tokens.Token + peekToken tokens.Token + + prefixParseFns map[tokens.Type]prefixParseFn + infixParseFns map[tokens.Type]infixParseFn +} + +// Config configures parser. +type Config struct { + crashOnError bool // crash parser on any error, for testing only +} + +// New creates a new parser. +func New(s *scanner.Scanner, config *Config) *Parser { + if config == nil { + config = new(Config) + } + + p := &Parser{ + s: s, + config: config, + prefixParseFns: make(map[tokens.Type]prefixParseFn), + infixParseFns: make(map[tokens.Type]infixParseFn), + } + + // grouped just like tokens.Type constants + + for t, f := range map[tokens.Type]prefixParseFn{ + tokens.Comment: p.parseComment, // TODO really?! 
+ + tokens.Integer: p.parseIntegerLiteral, + tokens.String: p.parseStringLiteral, + tokens.Identifier: p.parseIdentifier, + + tokens.Difference: p.parsePrefixExpression, + + tokens.Not: p.parsePrefixExpression, + + tokens.LPAREN: p.parseGroupedExpression, + + tokens.Func: p.parseFunctionLiteral, + + // TODO remove + tokens.True: p.parseBooleanLiteral, + tokens.False: p.parseBooleanLiteral, + } { + p.registerPrefix(t, f) + } + + for t, f := range map[tokens.Type]infixParseFn{ + tokens.Sum: p.parseInfixExpression, + tokens.Difference: p.parseInfixExpression, + tokens.Product: p.parseInfixExpression, + tokens.Quotient: p.parseInfixExpression, + tokens.Remainder: p.parseInfixExpression, + + // TODO AND, OR, XOR + + tokens.LogicalAnd: p.parseInfixExpression, + tokens.LogicalOr: p.parseInfixExpression, + + tokens.Equal: p.parseInfixExpression, + tokens.NotEqual: p.parseInfixExpression, + tokens.Less: p.parseInfixExpression, + tokens.LessOrEqual: p.parseInfixExpression, + tokens.Greater: p.parseInfixExpression, + tokens.GreaterOrEqual: p.parseInfixExpression, + + tokens.LPAREN: p.parseCallExpression, + } { + p.registerInfix(t, f) + } + + // set both curToken and peekToken + p.nextToken() + p.nextToken() + + return p +} + +type ( + prefixParseFn func() ast.Expression + infixParseFn func(ast.Expression) ast.Expression +) + +// List of precedences. 
+const ( + LowestPrec = 0 + UnaryPrec = 6 // -X, !X + HighestPrec = 7 // foo(X) +) + +var precedences = map[tokens.Type]int{ + // TODO make LowestPrec default, remove those + tokens.Illegal: LowestPrec, + tokens.EOF: LowestPrec, + tokens.Comment: LowestPrec, + tokens.LBRACE: LowestPrec, + tokens.RBRACE: LowestPrec, + tokens.RPAREN: LowestPrec, + tokens.Colon: LowestPrec, + tokens.Identifier: LowestPrec, + tokens.Assignment: LowestPrec, + tokens.Define: LowestPrec, + tokens.If: LowestPrec, + tokens.Else: LowestPrec, + tokens.Return: LowestPrec, + tokens.True: LowestPrec, + tokens.False: LowestPrec, + tokens.Func: LowestPrec, + tokens.Integer: LowestPrec, + tokens.Increment: LowestPrec, + tokens.Switch: LowestPrec, + tokens.Case: LowestPrec, + tokens.Var: LowestPrec, + tokens.For: LowestPrec, + tokens.SumAssignment: LowestPrec, + tokens.DifferenceAssignment: LowestPrec, + tokens.ProductAssignment: LowestPrec, + tokens.QuotientAssignment: LowestPrec, + tokens.RemainderAssignment: LowestPrec, + + tokens.LogicalOr: 1, + + tokens.LogicalAnd: 2, + + tokens.Equal: 3, + tokens.NotEqual: 3, + tokens.Less: 3, + tokens.LessOrEqual: 3, + tokens.Greater: 3, + tokens.GreaterOrEqual: 3, + + tokens.Sum: 4, + tokens.Difference: 4, + tokens.BitwiseOr: 4, + tokens.BitwiseXor: 4, + + tokens.Product: 5, + tokens.Quotient: 5, + tokens.Remainder: 5, + tokens.BitwiseAnd: 5, + + tokens.Not: UnaryPrec, + + tokens.LPAREN: HighestPrec, +} + +func (p *Parser) crash(format string, a ...interface{}) { + msg := fmt.Sprintf(format, a...) 
+ panic(fmt.Errorf("%s\ncurToken: %s\npeekToken: %s\nerrors: %s", msg, p.curToken, p.peekToken, p.errors)) +} + +func (p *Parser) peekPrecedence() int { + t := p.peekToken.Type + if p, ok := precedences[t]; ok { + return p + } + + p.crash("precedence for %s not found", t) // TODO remove + return LowestPrec +} + +func (p *Parser) curPrecedence() int { + t := p.curToken.Type + if p, ok := precedences[t]; ok { + return p + } + + p.crash("precedence for %s not found", t) // TODO remove + return LowestPrec +} + +func (p *Parser) addParsingError(format string, a ...interface{}) { + err := fmt.Sprintf(format, a...) + p.errors = append(p.errors, &Error{Err: err}) + + if p.config.crashOnError { + p.crash(format, a...) + } +} + +// Errors returns parsing errors, if any. +func (p *Parser) Errors() []error { + return p.errors +} + +// nextToken advances curToken and peekToken. +func (p *Parser) nextToken() { + p.curToken = p.peekToken + p.peekToken = p.s.NextToken() +} + +func (p *Parser) expectCurrent(tt ...tokens.Type) bool { + if len(tt) == 0 { + p.crash("expectCurrent called with zero token types") + } + + for _, t := range tt { + if p.curToken.Type == t { + return true + } + } + + switch l := len(tt); l { + case 1: + p.addParsingError("expected current token to be %s, got %s instead", tt[0], p.curToken.Type) + default: + expected := make([]string, l) + for i, t := range tt { + expected[i] = t.String() + } + exp := strings.Join(expected, ", ") + p.addParsingError("expected current token to be one of %s, got %s instead", exp, p.curToken.Type) + } + + return false + +} + +func (p *Parser) expectPeek(tt ...tokens.Type) bool { + if len(tt) == 0 { + p.crash("expectPeek called with zero token types") + } + + for _, t := range tt { + if p.peekToken.Type == t { + p.nextToken() + return true + } + } + + switch l := len(tt); l { + case 1: + p.addParsingError("expected next token to be %s, got %s instead", tt[0], p.peekToken) + default: + expected := make([]string, l) + for i, t := 
range tt { + expected[i] = t.String() + } + exp := strings.Join(expected, ", ") + p.addParsingError("expected next token to be one of %s, got %s instead", exp, p.peekToken) + } + + return false +} + +func (p *Parser) registerPrefix(tokenType tokens.Type, fn prefixParseFn) { + if _, ok := p.prefixParseFns[tokenType]; ok { + p.crash("prefix function for %s already registered", tokenType) + } + p.prefixParseFns[tokenType] = fn +} + +func (p *Parser) registerInfix(tokenType tokens.Type, fn infixParseFn) { + if _, ok := p.infixParseFns[tokenType]; ok { + p.crash("infix function for %s already registered", tokenType) + } + p.infixParseFns[tokenType] = fn +} + +func (p *Parser) parseComment() ast.Expression { + // TODO + return nil +} + +func (p *Parser) parseIntegerLiteral() ast.Expression { + lit := &ast.IntegerLiteral{Token: p.curToken} + + value, err := strconv.ParseInt(p.curToken.Literal, 0, 64) + if err != nil { + p.addParsingError("could not parse %q as integer", p.curToken.Literal) + return nil + } + + lit.Value = int(value) + return lit +} + +func (p *Parser) parseStringLiteral() ast.Expression { + s := p.curToken.Literal + if !strings.HasPrefix(s, `"`) || !strings.HasSuffix(s, `"`) { + p.addParsingError("could not parse %q as string", s) + return nil + } + return &ast.StringLiteral{Token: p.curToken, Value: s[1 : len(s)-1]} +} + +func (p *Parser) parseIdentifier() ast.Expression { + return &ast.Identifier{Token: p.curToken, Value: p.curToken.Literal} +} + +func (p *Parser) parseBooleanLiteral() ast.Expression { + t := p.curToken.Type == tokens.True + return &ast.BooleanLiteral{ + Token: p.curToken, + Value: t, + } +} + +func (p *Parser) parseExpression(precedence int) ast.Expression { + prefix := p.prefixParseFns[p.curToken.Type] + if prefix == nil { + p.addParsingError("no prefix parse function for %s found (token %s)", p.curToken.Type, p.curToken) + return nil + } + leftExp := prefix() + + for p.peekToken.Type != tokens.Semicolon && precedence < 
p.peekPrecedence() { + infix := p.infixParseFns[p.peekToken.Type] + if infix == nil { + return leftExp + } + p.nextToken() + leftExp = infix(leftExp) + } + + return leftExp +} + +func (p *Parser) parsePrefixExpression() ast.Expression { + expression := &ast.PrefixExpression{ + Token: p.curToken, + } + + p.nextToken() + expression.Right = p.parseExpression(UnaryPrec) + + return expression +} + +func (p *Parser) parseGroupedExpression() ast.Expression { + p.nextToken() + exp := p.parseExpression(LowestPrec) + if !p.expectPeek(tokens.RPAREN) { + return nil + } + return exp +} + +func (p *Parser) parseBlockStatement() *ast.BlockStatement { + if !p.expectCurrent(tokens.LBRACE) { + return nil + } + block := &ast.BlockStatement{Token: p.curToken} + block.Statements = make([]ast.Statement, 0, 1) + + p.nextToken() + + for p.curToken.Type != tokens.RBRACE && p.curToken.Type != tokens.EOF { + stmt := p.parseStatement() + if stmt != nil { + block.Statements = append(block.Statements, stmt) + } + p.nextToken() + } + + return block +} + +func (p *Parser) parseFunctionLiteral() ast.Expression { + lit := &ast.FunctionLiteral{Token: p.curToken} + + if !p.expectPeek(tokens.LPAREN) { + return nil + } + + lit.Parameters = p.parseFunctionParameters() + + if !p.expectPeek(tokens.LBRACE) { + return nil + } + + lit.Body = p.parseBlockStatement() + return lit +} + +func (p *Parser) parseFunctionParameters() []*ast.Identifier { + if p.peekToken.Type == tokens.RPAREN { + p.nextToken() + return []*ast.Identifier{} + } + + p.nextToken() + + identifiers := []*ast.Identifier{{Token: p.curToken, Value: p.curToken.Literal}} + + for p.peekToken.Type == tokens.Comma { + p.nextToken() + p.nextToken() + identifiers = append(identifiers, &ast.Identifier{Token: p.curToken, Value: p.curToken.Literal}) + } + + if !p.expectPeek(tokens.RPAREN) { + return nil + } + + return identifiers +} + +func (p *Parser) parseInfixExpression(left ast.Expression) ast.Expression { + expression := &ast.InfixExpression{ + 
Token: p.curToken, + Left: left, + } + + precedence := p.curPrecedence() + p.nextToken() + expression.Right = p.parseExpression(precedence) + + return expression +} + +func (p *Parser) parseCallExpression(function ast.Expression) ast.Expression { + exp := &ast.CallExpression{Token: p.curToken, Function: function} + exp.Arguments = p.parseCallArguments() + return exp +} + +func (p *Parser) parseCallArguments() []ast.Expression { + args := []ast.Expression{} + + if p.peekToken.Type == tokens.RPAREN { + p.nextToken() + return args + } + + p.nextToken() + args = append(args, p.parseExpression(LowestPrec)) + + for p.peekToken.Type == tokens.Comma { + p.nextToken() + p.nextToken() + args = append(args, p.parseExpression(LowestPrec)) + } + + if !p.expectPeek(tokens.RPAREN) { + return nil + } + + return args +} + +func (p *Parser) parseVarStatement() *ast.VarStatement { + stmt := &ast.VarStatement{Token: p.curToken} + if !p.expectPeek(tokens.Identifier) { + return nil + } + + stmt.Name = &ast.Identifier{Token: p.curToken, Value: p.curToken.Literal} + if !p.expectPeek(tokens.Assignment) { + return nil + } + + p.nextToken() + stmt.Value = p.parseExpression(LowestPrec) + + for p.peekToken.Type == tokens.Semicolon { + p.nextToken() + } + + return stmt +} + +func (p *Parser) parseIfStatement() *ast.IfStatement { + if !p.expectCurrent(tokens.If) { + return nil + } + stmt := &ast.IfStatement{Token: p.curToken} + + if !p.expectPeek(tokens.LPAREN) { + return nil + } + p.nextToken() + stmt.Cond = p.parseExpression(LowestPrec) + + if !p.expectPeek(tokens.RPAREN) { + return nil + } + if !p.expectPeek(tokens.LBRACE) { + return nil + } + stmt.Body = p.parseBlockStatement() + + for p.peekToken.Type == tokens.Semicolon { + p.nextToken() + } + + return stmt +} + +var assignTokens = []tokens.Type{ + tokens.Assignment, + tokens.SumAssignment, + tokens.DifferenceAssignment, + tokens.ProductAssignment, + tokens.QuotientAssignment, + tokens.RemainderAssignment, +} + +func (p *Parser) 
parseAssignStatement() *ast.AssignStatement { + if !p.expectCurrent(tokens.Identifier) { + return nil + } + stmt := &ast.AssignStatement{ + Name: &ast.Identifier{ + Token: p.curToken, + Value: p.curToken.Literal, + }, + } + + if !p.expectPeek(assignTokens...) { + return nil + } + stmt.Token = p.curToken + + p.nextToken() + stmt.Value = p.parseExpression(LowestPrec) + + for p.peekToken.Type == tokens.Semicolon { + p.nextToken() + } + return stmt +} + +func (p *Parser) parseIncrementDecrementStatement() *ast.IncrementDecrementStatement { + if !p.expectCurrent(tokens.Identifier) { + return nil + } + stmt := &ast.IncrementDecrementStatement{ + Name: &ast.Identifier{ + Token: p.curToken, + Value: p.curToken.Literal, + }, + } + + if !p.expectPeek(tokens.Increment, tokens.Decrement) { + return nil + } + stmt.Token = p.curToken + + for p.peekToken.Type == tokens.Semicolon { + p.nextToken() + } + return stmt +} + +func (p *Parser) parseReturnStatement() *ast.ReturnStatement { + stmt := &ast.ReturnStatement{Token: p.curToken} + p.nextToken() + stmt.Value = p.parseExpression(LowestPrec) + + for p.peekToken.Type == tokens.Semicolon { + p.nextToken() + } + + return stmt +} + +func (p *Parser) parseContinueStatement() *ast.ContinueStatement { + if !p.expectCurrent(tokens.Continue) { + return nil + } + stmt := &ast.ContinueStatement{ + Token: p.curToken, + } + + for p.peekToken.Type == tokens.Semicolon { + p.nextToken() + } + return stmt +} + +func (p *Parser) parseForStatement() *ast.ForStatement { + if !p.expectCurrent(tokens.For) { + return nil + } + stmt := &ast.ForStatement{Token: p.curToken} + + p.nextToken() + stmt.Init = p.parseAssignStatement() + + if !p.expectCurrent(tokens.Semicolon) { + return nil + } + p.nextToken() + stmt.Cond = p.parseExpression(LowestPrec) + + if !p.expectPeek(tokens.Semicolon) { + return nil + } + p.nextToken() + stmt.Post = p.parseStatement() + + p.nextToken() + stmt.Body = p.parseBlockStatement() + + for p.peekToken.Type == tokens.Semicolon { + 
p.nextToken() + } + + return stmt +} + +func (p *Parser) parseExpressionOrAssignmentStatement() ast.Statement { + cur := p.curToken + exp := p.parseExpression(LowestPrec) + + var stmt ast.Statement + switch p.peekToken.Type { + case tokens.Increment, tokens.Decrement: + stmt = p.parseIncrementDecrementStatement() + default: + for _, t := range assignTokens { + if p.peekToken.Type == t { + stmt = p.parseAssignStatement() + break + } + } + if stmt == nil { + stmt = &ast.ExpressionStatement{Token: cur, Expression: exp} + } + } + + for p.peekToken.Type == tokens.Semicolon { + p.nextToken() + } + + return stmt +} + +func (p *Parser) parseStatement() ast.Statement { + switch p.curToken.Type { + case tokens.Var: + return p.parseVarStatement() + case tokens.If: + return p.parseIfStatement() + case tokens.Return: + return p.parseReturnStatement() + case tokens.Continue: + return p.parseContinueStatement() + case tokens.For: + return p.parseForStatement() + default: + return p.parseExpressionOrAssignmentStatement() + } +} + +// ParseProgram parses the whole program and returns the root AST node, or nil if any errors are encountered. +func (p *Parser) ParseProgram() *ast.Program { + program := &ast.Program{ + Statements: make([]ast.Statement, 0, 8), + } + + for p.curToken.Type != tokens.EOF { + stmt := p.parseStatement() + if stmt != nil { + program.Statements = append(program.Statements, stmt) + } + p.nextToken() + } + + // TODO not sure about it + if len(p.Errors()) != 0 { + program = nil + } + + return program +} diff --git a/parser/parser_fuzz.go b/parser/parser_fuzz.go new file mode 100644 index 0000000..217d0a1 --- /dev/null +++ b/parser/parser_fuzz.go @@ -0,0 +1,31 @@ +// Gosh programming language. +// Copyright (c) 2018 Alexey Palazhchenko and contributors. +// +// This Source Code Form is subject to the terms of the Mozilla Public +// License, v. 2.0. If a copy of the MPL was not distributed with this +// file, You can obtain one at http://mozilla.org/MPL/2.0/. 
+ +// +build gofuzz + +package parser + +import ( + "gosh-lang.org/gosh/scanner" +) + +func Fuzz(data []byte) int { + s, err := scanner.New(string(data), &scanner.Config{ + SkipShebang: true, + }) + if err != nil { + return 0 + } + + p := New(s, nil) + program := p.ParseProgram() + if len(p.Errors()) > 0 { + return 0 + } + program.String() + return 1 +} diff --git a/parser/parser_test.go b/parser/parser_test.go new file mode 100644 index 0000000..e46280e --- /dev/null +++ b/parser/parser_test.go @@ -0,0 +1,293 @@ +// Gosh programming language. +// Copyright (c) 2018 Alexey Palazhchenko and contributors. +// +// This Source Code Form is subject to the terms of the Mozilla Public +// License, v. 2.0. If a copy of the MPL was not distributed with this +// file, You can obtain one at http://mozilla.org/MPL/2.0/. + +package parser + +import ( + "strings" + "testing" + + "github.com/davecgh/go-spew/spew" + "github.com/pmezard/go-difflib/difflib" + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" + + "gosh-lang.org/gosh/ast" + "gosh-lang.org/gosh/internal/gofuzz" + "gosh-lang.org/gosh/internal/golden" + "gosh-lang.org/gosh/scanner" + "gosh-lang.org/gosh/tokens" +) + +var spewConfig = &spew.ConfigState{ + Indent: " ", + DisableMethods: true, + DisablePointerMethods: true, + DisablePointerAddresses: true, + DisableCapacities: true, + ContinueOnMethod: true, +} + +func assertEqual(t *testing.T, expected, actual []ast.Statement) { + t.Helper() + + if !assert.Equal(t, expected, actual) { + e := spewConfig.Sdump(expected) + a := spewConfig.Sdump(actual) + diff, err := difflib.GetUnifiedDiffString(difflib.UnifiedDiff{ + A: difflib.SplitLines(e), + B: difflib.SplitLines(a), + FromFile: "Expected", + FromDate: "", + ToFile: "Actual", + ToDate: "", + Context: 1, + }) + require.NoError(t, err) + t.Logf("\n%s", diff) + } +} + +func formatErrors(errors []error) string { + var res strings.Builder + for _, err := range errors { + res.WriteString(err.Error()) 
+ res.WriteString("\n") + } + return res.String() +} + +func TestGolden(t *testing.T) { + for _, f := range golden.Data { + t.Run(f.File, func(t *testing.T) { + s, err := scanner.New(f.Source, &scanner.Config{ + SkipShebang: true, + }) + require.NoError(t, err) + p := New(s, &Config{ + crashOnError: true, + }) + program := p.ParseProgram() + require.Nil(t, p.Errors(), "%s", formatErrors(p.Errors())) + + actual := spewConfig.Sdump(program) + assert.Equal(t, f.AST, strings.Split(actual, "\n"), "actual:\n%s", actual) + + actual = program.String() + assert.Equal(t, f.Text, strings.Split(actual, "\n"), "actual:\n%s", actual) + }) + } +} + +func TestParser(t *testing.T) { + for source, expected := range map[string]ast.Statement{ + "var answer = 42": &ast.VarStatement{ + Token: tokens.Token{Offset: 0, Type: tokens.Var, Literal: "var"}, + Name: &ast.Identifier{ + Token: tokens.Token{Offset: 4, Type: tokens.Identifier, Literal: "answer"}, + Value: "answer", + }, + Value: &ast.IntegerLiteral{ + Token: tokens.Token{Offset: 13, Type: tokens.Integer, Literal: "42"}, + Value: 42, + }, + }, + + "answer = 42": &ast.AssignStatement{ + Token: tokens.Token{Offset: 7, Type: tokens.Assignment, Literal: "="}, + Name: &ast.Identifier{ + Token: tokens.Token{Offset: 0, Type: tokens.Identifier, Literal: "answer"}, + Value: "answer", + }, + Value: &ast.IntegerLiteral{ + Token: tokens.Token{Offset: 9, Type: tokens.Integer, Literal: "42"}, + Value: 42, + }, + }, + + "answer == 42": &ast.ExpressionStatement{ + Token: tokens.Token{Offset: 0, Type: tokens.Identifier, Literal: "answer"}, + Expression: &ast.InfixExpression{ + Token: tokens.Token{Offset: 7, Type: tokens.Equal, Literal: "=="}, + Left: &ast.Identifier{ + Token: tokens.Token{Offset: 0, Type: tokens.Identifier, Literal: "answer"}, + Value: "answer", + }, + Right: &ast.IntegerLiteral{ + Token: tokens.Token{Offset: 10, Type: tokens.Integer, Literal: "42"}, + Value: 42, + }, + }, + }, + + "answer += 42": &ast.AssignStatement{ + Token: 
tokens.Token{Offset: 7, Type: tokens.SumAssignment, Literal: "+="}, + Name: &ast.Identifier{ + Token: tokens.Token{Offset: 0, Type: tokens.Identifier, Literal: "answer"}, + Value: "answer", + }, + Value: &ast.IntegerLiteral{ + Token: tokens.Token{Offset: 10, Type: tokens.Integer, Literal: "42"}, + Value: 42, + }, + }, + + "answer++": &ast.IncrementDecrementStatement{ + Token: tokens.Token{Offset: 6, Type: tokens.Increment, Literal: "++"}, + Name: &ast.Identifier{ + Token: tokens.Token{Offset: 0, Type: tokens.Identifier, Literal: "answer"}, + Value: "answer", + }, + }, + + "return 42": &ast.ReturnStatement{ + Token: tokens.Token{Offset: 0, Type: tokens.Return, Literal: "return"}, + Value: &ast.IntegerLiteral{ + Token: tokens.Token{Offset: 7, Type: tokens.Integer, Literal: "42"}, + Value: 42, + }, + }, + + "if (6 * 9 == 42) {\ntrue;\nfalse;\n}": &ast.IfStatement{ + Token: tokens.Token{Offset: 0, Type: tokens.If, Literal: "if"}, + Cond: &ast.InfixExpression{ + Token: tokens.Token{Offset: 10, Type: tokens.Equal, Literal: "=="}, + Left: &ast.InfixExpression{ + Token: tokens.Token{Offset: 6, Type: tokens.Product, Literal: "*"}, + Left: &ast.IntegerLiteral{ + Token: tokens.Token{Offset: 4, Type: tokens.Integer, Literal: "6"}, + Value: 6, + }, + Right: &ast.IntegerLiteral{ + Token: tokens.Token{Offset: 8, Type: tokens.Integer, Literal: "9"}, + Value: 9, + }, + }, + Right: &ast.IntegerLiteral{ + Token: tokens.Token{Offset: 13, Type: tokens.Integer, Literal: "42"}, + Value: 42, + }, + }, + Body: &ast.BlockStatement{ + Token: tokens.Token{Offset: 17, Type: tokens.LBRACE, Literal: "{"}, + Statements: []ast.Statement{ + &ast.ExpressionStatement{ + Token: tokens.Token{Offset: 19, Type: tokens.True, Literal: "true"}, + Expression: &ast.BooleanLiteral{ + Token: tokens.Token{Offset: 19, Type: tokens.True, Literal: "true"}, + Value: true, + }, + }, + &ast.ExpressionStatement{ + Token: tokens.Token{Offset: 25, Type: tokens.False, Literal: "false"}, + Expression: &ast.BooleanLiteral{ 
+ Token: tokens.Token{Offset: 25, Type: tokens.False, Literal: "false"}, + Value: false, + }, + }, + }, + }, + }, + + "for i = 1; i <= 100; i++ {\n}": &ast.ForStatement{ + Token: tokens.Token{Offset: 0, Type: tokens.For, Literal: "for"}, + Init: &ast.AssignStatement{ + Token: tokens.Token{Offset: 6, Type: tokens.Assignment, Literal: "="}, + Name: &ast.Identifier{ + Token: tokens.Token{Offset: 4, Type: tokens.Identifier, Literal: "i"}, + Value: "i", + }, + Value: &ast.IntegerLiteral{ + Token: tokens.Token{Offset: 8, Type: tokens.Integer, Literal: "1"}, + Value: 1, + }, + }, + Cond: &ast.InfixExpression{ + Token: tokens.Token{Offset: 13, Type: tokens.LessOrEqual, Literal: "<="}, + Left: &ast.Identifier{ + Token: tokens.Token{Offset: 11, Type: tokens.Identifier, Literal: "i"}, + Value: "i", + }, + Right: &ast.IntegerLiteral{ + Token: tokens.Token{Offset: 16, Type: tokens.Integer, Literal: "100"}, + Value: 100, + }, + }, + Post: &ast.IncrementDecrementStatement{ + Token: tokens.Token{Offset: 22, Type: tokens.Increment, Literal: "++"}, + Name: &ast.Identifier{ + Token: tokens.Token{Offset: 21, Type: tokens.Identifier, Literal: "i"}, + Value: "i", + }, + }, + Body: &ast.BlockStatement{ + Token: tokens.Token{Offset: 25, Type: tokens.LBRACE, Literal: "{"}, + Statements: []ast.Statement{}, + }, + }, + + `println("answer")`: &ast.ExpressionStatement{ + Token: tokens.Token{Offset: 0, Type: tokens.Identifier, Literal: "println"}, + Expression: &ast.CallExpression{ + Token: tokens.Token{Offset: 7, Type: tokens.LPAREN, Literal: "("}, + Function: &ast.Identifier{ + Token: tokens.Token{Offset: 0, Type: tokens.Identifier, Literal: "println"}, + Value: "println", + }, + Arguments: []ast.Expression{ + &ast.StringLiteral{ + Token: tokens.Token{Offset: 8, Type: tokens.String, Literal: `"answer"`}, + Value: "answer", + }, + }, + }, + }, + } { + t.Run(source, func(t *testing.T) { + formal := source + ";\n" + for _, input := range []string{source, source + ";", source + "\n", formal, 
source + ";\n\n;;"} { + gofuzz.AddDataToCorpus("parser", []byte(input)) + + s, err := scanner.New(input, &scanner.Config{ + SkipShebang: true, + }) + require.NoError(t, err) + p := New(s, &Config{ + // crashOnError: true, + }) + program := p.ParseProgram() + require.Nil(t, p.Errors(), "%s", formatErrors(p.Errors())) + require.NotNil(t, program) + assertEqual(t, []ast.Statement{expected}, program.Statements) + assert.Equal(t, formal, program.String()) + assert.Equal(t, tokens.Token{Offset: len(input), Type: tokens.EOF}, p.curToken) + } + }) + } +} + +func TestErrors(t *testing.T) { + for input, errors := range map[string][]error{ + `(`: { + &Error{Err: "no prefix parse function for EOF found (token [ 1: EOF ])"}, + &Error{Err: "expected next token to be RPAREN, got [ 2: EOF ] instead"}, + }, + } { + t.Run(input, func(t *testing.T) { + gofuzz.AddDataToCorpus("parser", []byte(input)) + + s, err := scanner.New(input, &scanner.Config{ + SkipShebang: true, + }) + require.NoError(t, err) + p := New(s, nil) + program := p.ParseProgram() + assert.Nil(t, program) + assert.Equal(t, errors, p.Errors()) + }) + } +} diff --git a/scanner/scanner.go b/scanner/scanner.go new file mode 100644 index 0000000..940b63c --- /dev/null +++ b/scanner/scanner.go @@ -0,0 +1,450 @@ +// Gosh programming language. +// Copyright (c) 2018 Alexey Palazhchenko and contributors. +// +// This Source Code Form is subject to the terms of the Mozilla Public +// License, v. 2.0. If a copy of the MPL was not distributed with this +// file, You can obtain one at http://mozilla.org/MPL/2.0/. + +// Package scanner implements a scanner for Gosh source text. +package scanner + +import ( + "fmt" + + "gosh-lang.org/gosh/tokens" +) + +// Scanner extracts tokens from Gosh source code. +type Scanner struct { + config *Config + input []rune + + rPos int // current rune position in input + r rune // current rune + insertSemicolon bool // return next \n as semicolon +} + +// Config configures scanner. 
+type Config struct { + SkipShebang bool // if true, scanner will skip the first line of input if it starts with #! + + crashOnError bool // crash scanner on any illegal token, for testing only + dontInsertSemicolon bool // disable automatic semicolon insertion, for testing only +} + +var keywords = map[string]tokens.Type{ + "break": tokens.Break, + "case": tokens.Case, + "chan": tokens.Chan, + "const": tokens.Const, + "continue": tokens.Continue, + "default": tokens.Default, + "defer": tokens.Defer, + "else": tokens.Else, + "fallthrough": tokens.Fallthrough, + "for": tokens.For, + "func": tokens.Func, + "go": tokens.Go, + "goto": tokens.Goto, + "if": tokens.If, + "import": tokens.Import, + "interface": tokens.Interface, + "map": tokens.Map, + "package": tokens.Package, + "range": tokens.Range, + "return": tokens.Return, + "select": tokens.Select, + "struct": tokens.Struct, + "switch": tokens.Switch, + "var": tokens.Var, + + // TODO remove - those are not keywords + "true": tokens.True, + "false": tokens.False, +} + +// New creates new scanner for the given Gosh source code. +func New(input string, config *Config) (*Scanner, error) { + if config == nil { + config = new(Config) + } + + runes := []rune(input) + for _, r := range runes { + if r == 0 { + return nil, fmt.Errorf("input contains NUL character (U+0000)") + } + } + + l := &Scanner{ + config: config, + input: runes, + rPos: -1, + } + l.readRune() + return l, nil +} + +func isLetter(r rune) bool { + switch { + case 'a' <= r && r <= 'z': + return true + case 'A' <= r && r <= 'Z': + return true + case r == '_': + return true + default: + return false + } +} + +func isDigit(r rune) bool { + return '0' <= r && r <= '9' +} + +func (s *Scanner) crash(format string, a ...interface{}) { + msg := fmt.Sprintf(format, a...) 
+ panic(fmt.Errorf("%s\nrPos: %d\nr: %q", msg, s.rPos, s.r)) +} + +func (s *Scanner) peekRune() rune { + peekPos := s.rPos + 1 + if peekPos >= len(s.input) { + return 0 + } + return s.input[peekPos] +} + +func (s *Scanner) readRune() { + s.r = s.peekRune() + s.rPos++ +} + +func (s *Scanner) skipWhitespace() { + for { + switch s.r { + case ' ', '\t', '\r': + s.readRune() + case '\n': + if s.insertSemicolon { + return + } + s.readRune() + default: + return + } + } +} + +// readLine reads and returns the line up to '\n' or EOF. +func (s *Scanner) readLine() string { + pos := s.rPos + for { + s.readRune() + if s.r == '\n' || s.r == 0 { + break + } + } + return string(s.input[pos:s.rPos]) +} + +func (s *Scanner) readInt() (string, bool) { + pos := s.rPos + for isDigit(s.r) { + s.readRune() + } + ok := true + if isLetter(s.r) { + ok = false + for isLetter(s.r) || isDigit(s.r) { + s.readRune() + } + } + return string(s.input[pos:s.rPos]), ok +} + +func (s *Scanner) readString() (string, bool) { + pos := s.rPos + for { + s.readRune() + if s.r == '"' || s.r == 0 { + break + } + } + ok := s.r == '"' + if ok { + s.readRune() + } + return string(s.input[pos:s.rPos]), ok +} + +func (s *Scanner) readIdentifier() string { + pos := s.rPos + for isLetter(s.r) || isDigit(s.r) { + s.readRune() + } + return string(s.input[pos:s.rPos]) +} + +func (s *Scanner) lookupIdentifier(ident string) tokens.Type { + if t, ok := keywords[ident]; ok { + return t + } + return tokens.Identifier +} + +// NextToken returns the next scanned token. +// Once it returns tokens.EOF, it will continue to do so. 
+//nolint:gocyclo +func (s *Scanner) NextToken() tokens.Token { + s.skipWhitespace() + tok := tokens.Token{Offset: s.rPos, Type: tokens.Illegal} + + if s.config.crashOnError { + defer func() { + if tok.Type == tokens.Illegal { + s.crash("illegal token: %s", tok) + } + }() + } + + var insertSemicolon bool + defer func() { + if !s.config.dontInsertSemicolon { + s.insertSemicolon = insertSemicolon + } + }() + + switch s.r { + case 0: + tok.Type = tokens.EOF + case '\n': + // s.skipWhitespace() exited on \n + tok.Type = tokens.Semicolon + tok.Literal = "\n" + case '#': + if s.rPos == 0 && s.peekRune() == '!' && s.config.SkipShebang { + s.readLine() + return s.NextToken() + } + tok.Literal = string(s.r) + + case '=': + switch s.peekRune() { + case '=': + s.readRune() + tok.Type = tokens.Equal + tok.Literal = "==" + default: + tok.Type = tokens.Assignment + tok.Literal = "=" + } + case ':': + switch s.peekRune() { + case '=': + s.readRune() + tok.Type = tokens.Define + tok.Literal = ":=" + default: + tok.Type = tokens.Colon + tok.Literal = ":" + } + + case '+': + switch s.peekRune() { + case '+': + s.readRune() + tok.Type = tokens.Increment + tok.Literal = "++" + insertSemicolon = true + case '=': + s.readRune() + tok.Type = tokens.SumAssignment + tok.Literal = "+=" + default: + tok.Type = tokens.Sum + tok.Literal = "+" + } + case '-': + switch s.peekRune() { + case '-': + s.readRune() + tok.Type = tokens.Decrement + tok.Literal = "--" + insertSemicolon = true + case '=': + s.readRune() + tok.Type = tokens.DifferenceAssignment + tok.Literal = "-=" + default: + tok.Type = tokens.Difference + tok.Literal = "-" + } + case '*': + switch s.peekRune() { + case '=': + s.readRune() + tok.Type = tokens.ProductAssignment + tok.Literal = "*=" + default: + tok.Type = tokens.Product + tok.Literal = "*" + } + case '/': + switch s.peekRune() { + case '/': + tok.Type = tokens.Comment + tok.Literal = s.readLine() + case '=': + s.readRune() + tok.Type = tokens.QuotientAssignment + 
tok.Literal = "/=" + default: + tok.Type = tokens.Quotient + tok.Literal = "/" + } + case '%': + switch s.peekRune() { + case '=': + s.readRune() + tok.Type = tokens.RemainderAssignment + tok.Literal = "%=" + default: + tok.Type = tokens.Remainder + tok.Literal = "%" + } + + case '&': + switch s.peekRune() { + case '&': + s.readRune() + tok.Type = tokens.LogicalAnd + tok.Literal = "&&" + default: + tok.Type = tokens.BitwiseAnd + tok.Literal = "&" + } + case '|': + switch s.peekRune() { + case '|': + s.readRune() + tok.Type = tokens.LogicalOr + tok.Literal = "||" + default: + tok.Type = tokens.BitwiseOr + tok.Literal = "|" + } + case '^': + tok.Type = tokens.BitwiseXor + tok.Literal = "^" + + case '!': + switch s.peekRune() { + case '=': + s.readRune() + tok.Type = tokens.NotEqual + tok.Literal = "!=" + default: + tok.Type = tokens.Not + tok.Literal = "!" + } + + case '<': + switch s.peekRune() { + case '=': + s.readRune() + tok.Type = tokens.LessOrEqual + tok.Literal = "<=" + default: + tok.Type = tokens.Less + tok.Literal = "<" + } + case '>': + switch s.peekRune() { + case '=': + s.readRune() + tok.Type = tokens.GreaterOrEqual + tok.Literal = ">=" + default: + tok.Type = tokens.Greater + tok.Literal = ">" + } + + case ';': + tok.Type = tokens.Semicolon + tok.Literal = ";" + case ',': + tok.Type = tokens.Comma + tok.Literal = "," + case '.': + tok.Type = tokens.Period + tok.Literal = "." 
+ + case '(': + tok.Type = tokens.LPAREN + tok.Literal = "(" + case ')': + tok.Type = tokens.RPAREN + tok.Literal = ")" + insertSemicolon = true + case '{': + tok.Type = tokens.LBRACE + tok.Literal = "{" + case '}': + tok.Type = tokens.RBRACE + tok.Literal = "}" + insertSemicolon = true + + case '"': + lit, ok := s.readString() + tok.Literal = lit + if ok { + tok.Type = tokens.String + } + insertSemicolon = true + return tok // s.readRune() already called by s.readString(), so exit early + + default: + switch { + case isLetter(s.r): + tok.Literal = s.readIdentifier() + tok.Type = s.lookupIdentifier(tok.Literal) + switch tok.Type { + case tokens.Identifier, tokens.Break, tokens.Continue, tokens.Fallthrough, tokens.Return: + fallthrough + case tokens.True, tokens.False: // TODO remove + insertSemicolon = true + } + return tok // s.readRune() already called by s.readIdentifier(), so exit early + + case isDigit(s.r): + lit, ok := s.readInt() + tok.Literal = lit + if ok { + tok.Type = tokens.Integer + } + insertSemicolon = true + return tok // s.readRune() already called by s.readInt(), so exit early + + default: + // TODO insertSemicolon? + tok.Literal = string(s.r) + } + } + + s.readRune() + return tok +} + +// allTokens returns all tokens up to and including the first tokens.EOF or tokens.Illegal. +func (s *Scanner) allTokens() []tokens.Token { + var res []tokens.Token + for { + tok := s.NextToken() + res = append(res, tok) + switch tok.Type { + case tokens.EOF, tokens.Illegal: + return res + } + } +} diff --git a/scanner/scanner_fuzz.go b/scanner/scanner_fuzz.go new file mode 100644 index 0000000..8ff7954 --- /dev/null +++ b/scanner/scanner_fuzz.go @@ -0,0 +1,66 @@ +// Gosh programming language. +// Copyright (c) 2018 Alexey Palazhchenko and contributors. +// +// This Source Code Form is subject to the terms of the Mozilla Public +// License, v. 2.0. If a copy of the MPL was not distributed with this +// file, You can obtain one at http://mozilla.org/MPL/2.0/. 
+ +// +build gofuzz + +package scanner + +import ( + "fmt" + + "gosh-lang.org/gosh/tokens" +) + +func logTokens(tokens []tokens.Token) { + for _, t := range tokens { + fmt.Println(t) + } +} + +func Fuzz(data []byte) int { + input := string(data) + l := len(input) // TODO len([]rune(input)) + s, err := New(input, &Config{ + SkipShebang: true, + }) + if err != nil { + return 0 + } + + t := s.allTokens() + if len(t) == 0 { + panic("should not return 0 tokens") + } + + offset := -1 + for _, tok := range t { + if offset >= tok.Offset { + logTokens(t) + panic(fmt.Sprintf("unexpected offset for token %s (previous offset: %d)", tok, offset)) + } + offset = tok.Offset + } + + last := t[len(t)-1] + if last.Type == tokens.Illegal { + if last.Offset == l { + logTokens(t) + panic(fmt.Sprintf("unexpected last illegal token offset: %d", last.Offset)) + } + return 0 + } + + if last.Type != tokens.EOF { + logTokens(t) + panic(fmt.Sprintf("unexpected last token: %s", last)) + } + if last.Offset != l { + logTokens(t) + panic(fmt.Sprintf("unexpected last token offset: %d (expected: %d)", last.Offset, l)) + } + return 1 +} diff --git a/scanner/scanner_test.go b/scanner/scanner_test.go new file mode 100644 index 0000000..a85e972 --- /dev/null +++ b/scanner/scanner_test.go @@ -0,0 +1,361 @@ +// Gosh programming language. +// Copyright (c) 2018 Alexey Palazhchenko and contributors. +// +// This Source Code Form is subject to the terms of the Mozilla Public +// License, v. 2.0. If a copy of the MPL was not distributed with this +// file, You can obtain one at http://mozilla.org/MPL/2.0/. 
+ +package scanner + +import ( + "strings" + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" + + "gosh-lang.org/gosh/internal/gofuzz" + "gosh-lang.org/gosh/internal/golden" + "gosh-lang.org/gosh/tokens" +) + +func TestGolden(t *testing.T) { + for _, f := range golden.Data { + t.Run(f.File, func(t *testing.T) { + var actual []string + s, err := New(f.Source, &Config{ + SkipShebang: true, + }) + require.NoError(t, err) + for _, tok := range s.allTokens() { + actual = append(actual, tok.String()) + } + assert.Equal(t, strings.Split(f.Tokens, "\n"), actual, "actual:\n%s", strings.Join(actual, "\n")) + }) + } +} + +func TestScanner(t *testing.T) { + // in order of tokens.Type constants + testdata := map[string][]tokens.Token{ + `#`: { + {Type: tokens.Illegal, Literal: `#`}, + }, + `…`: { + {Type: tokens.Illegal, Literal: `…`}, + }, + `42foo`: { + {Type: tokens.Illegal, Literal: `42foo`}, + }, + `"Invalid`: { + {Type: tokens.Illegal, Literal: `"Invalid`}, + }, + ``: { + {Type: tokens.EOF}, + }, + + // FIXME fix code and enable those tests + "// Comment 1\n// Comment 2": { + {Offset: 0, Type: tokens.Comment, Literal: "// Comment 1"}, + {Offset: 13, Type: tokens.Comment, Literal: "// Comment 2"}, + {Offset: 25, Type: tokens.EOF}, + }, + "// Comment 1\n// Comment 2\n": { + {Offset: 0, Type: tokens.Comment, Literal: "// Comment 1"}, + {Offset: 13, Type: tokens.Comment, Literal: "// Comment 2"}, + {Offset: 26, Type: tokens.EOF}, + }, + + `foo FOO _ foo42`: { + {Offset: 0, Type: tokens.Identifier, Literal: `foo`}, + {Offset: 4, Type: tokens.Identifier, Literal: `FOO`}, + {Offset: 8, Type: tokens.Identifier, Literal: `_`}, + {Offset: 10, Type: tokens.Identifier, Literal: `foo42`}, + {Offset: 15, Type: tokens.EOF}, + }, + `42 042`: { + {Offset: 0, Type: tokens.Integer, Literal: `42`}, + {Offset: 3, Type: tokens.Integer, Literal: `042`}, + {Offset: 6, Type: tokens.EOF}, + }, + // TODO Float + // TODO Character, Rune, Byte? 
+ `"Hello, world!"`: { + {Offset: 0, Type: tokens.String, Literal: `"Hello, world!"`}, + {Offset: 15, Type: tokens.EOF}, + }, + + `=:=`: { + {Offset: 0, Type: tokens.Assignment, Literal: `=`}, + {Offset: 1, Type: tokens.Define, Literal: `:=`}, + {Offset: 3, Type: tokens.EOF}, + }, + + `+-*/%`: { + {Offset: 0, Type: tokens.Sum, Literal: `+`}, + {Offset: 1, Type: tokens.Difference, Literal: `-`}, + {Offset: 2, Type: tokens.Product, Literal: `*`}, + {Offset: 3, Type: tokens.Quotient, Literal: `/`}, + {Offset: 4, Type: tokens.Remainder, Literal: `%`}, + {Offset: 5, Type: tokens.EOF}, + }, + + `+=-=*=/=%=`: { + {Offset: 0, Type: tokens.SumAssignment, Literal: `+=`}, + {Offset: 2, Type: tokens.DifferenceAssignment, Literal: `-=`}, + {Offset: 4, Type: tokens.ProductAssignment, Literal: `*=`}, + {Offset: 6, Type: tokens.QuotientAssignment, Literal: `/=`}, + {Offset: 8, Type: tokens.RemainderAssignment, Literal: `%=`}, + {Offset: 10, Type: tokens.EOF}, + }, + + `++--`: { + {Offset: 0, Type: tokens.Increment, Literal: `++`}, + {Offset: 2, Type: tokens.Decrement, Literal: `--`}, + {Offset: 4, Type: tokens.EOF}, + }, + + `&|^`: { + {Offset: 0, Type: tokens.BitwiseAnd, Literal: `&`}, + {Offset: 1, Type: tokens.BitwiseOr, Literal: `|`}, + {Offset: 2, Type: tokens.BitwiseXor, Literal: `^`}, + {Offset: 3, Type: tokens.EOF}, + }, + + `&&||`: { + {Offset: 0, Type: tokens.LogicalAnd, Literal: `&&`}, + {Offset: 2, Type: tokens.LogicalOr, Literal: `||`}, + {Offset: 4, Type: tokens.EOF}, + }, + + `!`: { + {Offset: 0, Type: tokens.Not, Literal: `!`}, + {Offset: 1, Type: tokens.EOF}, + }, + + `==!=<=<>>=`: { + {Offset: 0, Type: tokens.Equal, Literal: `==`}, + {Offset: 2, Type: tokens.NotEqual, Literal: `!=`}, + {Offset: 4, Type: tokens.LessOrEqual, Literal: `<=`}, + {Offset: 6, Type: tokens.Less, Literal: `<`}, + {Offset: 7, Type: tokens.Greater, Literal: `>`}, + {Offset: 8, Type: tokens.GreaterOrEqual, Literal: `>=`}, + {Offset: 10, Type: tokens.EOF}, + }, + + `:;,.`: { + {Offset: 0, 
Type: tokens.Colon, Literal: `:`}, + {Offset: 1, Type: tokens.Semicolon, Literal: `;`}, + {Offset: 2, Type: tokens.Comma, Literal: `,`}, + {Offset: 3, Type: tokens.Period, Literal: `.`}, + {Offset: 4, Type: tokens.EOF}, + }, + + `(){}`: { + {Offset: 0, Type: tokens.LPAREN, Literal: `(`}, + {Offset: 1, Type: tokens.RPAREN, Literal: `)`}, + {Offset: 2, Type: tokens.LBRACE, Literal: `{`}, + {Offset: 3, Type: tokens.RBRACE, Literal: `}`}, + {Offset: 4, Type: tokens.EOF}, + }, + + `break case chan const continue default defer else fallthrough for func go ` + + `goto if import interface map package range return select struct switch var`: { + {Offset: 0, Type: tokens.Break, Literal: `break`}, + {Offset: 6, Type: tokens.Case, Literal: `case`}, + {Offset: 11, Type: tokens.Chan, Literal: `chan`}, + {Offset: 16, Type: tokens.Const, Literal: `const`}, + {Offset: 22, Type: tokens.Continue, Literal: `continue`}, + {Offset: 31, Type: tokens.Default, Literal: `default`}, + {Offset: 39, Type: tokens.Defer, Literal: `defer`}, + {Offset: 45, Type: tokens.Else, Literal: `else`}, + {Offset: 50, Type: tokens.Fallthrough, Literal: `fallthrough`}, + {Offset: 62, Type: tokens.For, Literal: `for`}, + {Offset: 66, Type: tokens.Func, Literal: `func`}, + {Offset: 71, Type: tokens.Go, Literal: `go`}, + {Offset: 74, Type: tokens.Goto, Literal: `goto`}, + {Offset: 79, Type: tokens.If, Literal: `if`}, + {Offset: 82, Type: tokens.Import, Literal: `import`}, + {Offset: 89, Type: tokens.Interface, Literal: `interface`}, + {Offset: 99, Type: tokens.Map, Literal: `map`}, + {Offset: 103, Type: tokens.Package, Literal: `package`}, + {Offset: 111, Type: tokens.Range, Literal: `range`}, + {Offset: 117, Type: tokens.Return, Literal: `return`}, + {Offset: 124, Type: tokens.Select, Literal: `select`}, + {Offset: 131, Type: tokens.Struct, Literal: `struct`}, + {Offset: 138, Type: tokens.Switch, Literal: `switch`}, + {Offset: 145, Type: tokens.Var, Literal: `var`}, + {Offset: 148, Type: tokens.EOF}, + }, + + 
`true false`: { + {Offset: 0, Type: tokens.True, Literal: `true`}, + {Offset: 5, Type: tokens.False, Literal: `false`}, + {Offset: 10, Type: tokens.EOF}, + }, + } + + for input, tokens := range testdata { + t.Run(input, func(t *testing.T) { + gofuzz.AddDataToCorpus("scanner", []byte(input)) + + if strings.HasPrefix(input, "// ") { + t.Skip("FIXME broken code") + } + + offset := -1 + for _, tok := range tokens { + require.True(t, offset < tok.Offset, "unexpected offset for token %s", tok) + offset = tok.Offset + } + + l, err := New(input, &Config{ + dontInsertSemicolon: true, + }) + require.NoError(t, err) + assert.Equal(t, tokens, l.allTokens(), "Input: %q", input) + }) + } +} + +func TestSemicolonInsertion(t *testing.T) { + input := strings.TrimLeft(` +var +return +break; +continue +fallthrough; + +true +false; + +x +x += 1 +x++ + +foo() +func() {} +`, "\n") + l, err := New(input, nil) + require.NoError(t, err) + expected := []tokens.Token{ + {Offset: 0, Type: tokens.Var, Literal: "var"}, + {Offset: 4, Type: tokens.Return, Literal: "return"}, + {Offset: 10, Type: tokens.Semicolon, Literal: "\n"}, + {Offset: 11, Type: tokens.Break, Literal: "break"}, + {Offset: 16, Type: tokens.Semicolon, Literal: ";"}, + {Offset: 18, Type: tokens.Continue, Literal: "continue"}, + {Offset: 26, Type: tokens.Semicolon, Literal: "\n"}, + {Offset: 27, Type: tokens.Fallthrough, Literal: "fallthrough"}, + {Offset: 38, Type: tokens.Semicolon, Literal: ";"}, + + {Offset: 41, Type: tokens.True, Literal: "true"}, + {Offset: 45, Type: tokens.Semicolon, Literal: "\n"}, + {Offset: 46, Type: tokens.False, Literal: "false"}, + {Offset: 51, Type: tokens.Semicolon, Literal: ";"}, + + {Offset: 54, Type: tokens.Identifier, Literal: "x"}, + {Offset: 55, Type: tokens.Semicolon, Literal: "\n"}, + {Offset: 56, Type: tokens.Identifier, Literal: "x"}, + {Offset: 58, Type: tokens.SumAssignment, Literal: "+="}, + {Offset: 61, Type: tokens.Integer, Literal: "1"}, + {Offset: 62, Type: tokens.Semicolon, 
Literal: "\n"}, + {Offset: 63, Type: tokens.Identifier, Literal: "x"}, + {Offset: 64, Type: tokens.Increment, Literal: "++"}, + {Offset: 66, Type: tokens.Semicolon, Literal: "\n"}, + + {Offset: 68, Type: tokens.Identifier, Literal: "foo"}, + {Offset: 71, Type: tokens.LPAREN, Literal: "("}, + {Offset: 72, Type: tokens.RPAREN, Literal: ")"}, + {Offset: 73, Type: tokens.Semicolon, Literal: "\n"}, + {Offset: 74, Type: tokens.Func, Literal: "func"}, + {Offset: 78, Type: tokens.LPAREN, Literal: "("}, + {Offset: 79, Type: tokens.RPAREN, Literal: ")"}, + {Offset: 81, Type: tokens.LBRACE, Literal: "{"}, + {Offset: 82, Type: tokens.RBRACE, Literal: "}"}, + {Offset: 83, Type: tokens.Semicolon, Literal: "\n"}, + + {Offset: 84, Type: tokens.EOF}, + } + assert.Equal(t, expected, l.allTokens()) + + t.Run("Without newline", func(t *testing.T) { + testdata := map[string][]tokens.Token{ + `0`: { + {Offset: 0, Type: tokens.Integer, Literal: "0"}, + {Offset: 1, Type: tokens.EOF, Literal: ""}, + }, + `(`: { + {Offset: 0, Type: tokens.LPAREN, Literal: "("}, + {Offset: 1, Type: tokens.EOF, Literal: ""}, + }, + } + + for input, tokens := range testdata { + t.Run(input, func(t *testing.T) { + gofuzz.AddDataToCorpus("scanner", []byte(input)) + + offset := -1 + for _, tok := range tokens { + require.True(t, offset < tok.Offset, "unexpected offset for token %s", tok) + offset = tok.Offset + } + + l, err := New(input, nil) + require.NoError(t, err) + assert.Equal(t, tokens, l.allTokens(), "Input: %q", input) + }) + } + }) +} + +func TestShebang(t *testing.T) { + l, err := New("#!/usr/bin/env gosh\nfoo", &Config{ + SkipShebang: false, + }) + require.NoError(t, err) + expected := []tokens.Token{ + {Type: tokens.Illegal, Literal: `#`}, + } + assert.Equal(t, expected, l.allTokens()) + + l, err = New("#!/usr/bin/env gosh", &Config{ + SkipShebang: false, + }) + require.NoError(t, err) + expected = []tokens.Token{ + {Type: tokens.Illegal, Literal: `#`}, + } + assert.Equal(t, expected, l.allTokens()) 
+ + l, err = New("#!/usr/bin/env gosh\nfoo", &Config{ + SkipShebang: true, + }) + require.NoError(t, err) + expected = []tokens.Token{ + {Offset: 20, Type: tokens.Identifier, Literal: `foo`}, + {Offset: 23, Type: tokens.EOF}, + } + assert.Equal(t, expected, l.allTokens()) + + l, err = New("#!/usr/bin/env gosh", &Config{ + SkipShebang: true, + }) + require.NoError(t, err) + expected = []tokens.Token{ + {Offset: 19, Type: tokens.EOF}, + } + assert.Equal(t, expected, l.allTokens()) +} + +func TestZeroRune(t *testing.T) { + for _, input := range []string{"12\x00", "1\x002"} { + t.Run(input, func(t *testing.T) { + gofuzz.AddDataToCorpus("scanner", []byte(input)) + + l, err := New(input, nil) + require.Error(t, err) + require.Nil(t, l) + }) + } +} diff --git a/tokens/tokens.go b/tokens/tokens.go new file mode 100644 index 0000000..54fe5ed --- /dev/null +++ b/tokens/tokens.go @@ -0,0 +1,38 @@ +// Gosh programming language. +// Copyright (c) 2018 Alexey Palazhchenko and contributors. +// +// This Source Code Form is subject to the terms of the Mozilla Public +// License, v. 2.0. If a copy of the MPL was not distributed with this +// file, You can obtain one at http://mozilla.org/MPL/2.0/. + +// Package tokens defines constants representing the lexical tokens of the Gosh programming language. +package tokens + +import ( + "fmt" +) + +// Token represents a lexical token of the Gosh programming language. +type Token struct { + Offset int + Type Type + Literal string +} + +// String returns the string representation of the token. 
+func (tok Token) String() string { + res := fmt.Sprintf("%d: %s", tok.Offset, tok.Type.String()) + if tok.Literal != "" { + if tok.Type == Semicolon && tok.Literal == "\n" { + res += " newline" + } else { + res += " " + tok.Literal + } + } + return "[ " + res + " ]" +} + +// check interfaces +var ( + _ fmt.Stringer = Token{} +) diff --git a/tokens/type.go b/tokens/type.go new file mode 100644 index 0000000..286b0be --- /dev/null +++ b/tokens/type.go @@ -0,0 +1,137 @@ +// Gosh programming language. +// Copyright (c) 2018 Alexey Palazhchenko and contributors. +// +// This Source Code Form is subject to the terms of the Mozilla Public +// License, v. 2.0. If a copy of the MPL was not distributed with this +// file, You can obtain one at http://mozilla.org/MPL/2.0/. + +package tokens + +import ( + "fmt" +) + +// Type is the set of lexical token types of the Gosh programming language. +type Type string + +func (t Type) String() string { + return string(t) +} + +// The list of token types. +const ( + Illegal Type = "ILLEGAL" + + EOF Type = "EOF" + Comment Type = "COMMENT" + + Identifier Type = "IDENTIFIER" + Integer Type = "INTEGER" + // Float = "FLOAT" + // Rune = "RUNE" + String Type = "STRING" + + Assignment Type = "ASSIGNMENT" // = + Define Type = "DEFINE" // := + + Sum Type = "SUM" // + + Difference Type = "DIFFERENCE" // - + Product Type = "PRODUCT" // * + Quotient Type = "QUOTIENT" // / + Remainder Type = "REMAINDER" // % + + SumAssignment Type = "SUM_ASSIGNMENT" // += + DifferenceAssignment Type = "DIFFERENCE_ASSIGNMENT" // -= + ProductAssignment Type = "PRODUCT_ASSIGNMENT" // *= + QuotientAssignment Type = "QUOTIENT_ASSIGNMENT" // /= + RemainderAssignment Type = "REMAINDER_ASSIGNMENT" // %= + + Increment Type = "INCREMENT" // ++ + Decrement Type = "DECREMENT" // -- + + BitwiseAnd Type = "BITWISE_AND" // & + BitwiseOr Type = "BITWISE_OR" // | + BitwiseXor Type = "BITWISE_XOR" // ^ + // TODO &^ + + // TODO &= + // TODO |= + // TODO ^= + // TODO &^= + + // TODO << 
+ // TODO >> + + // TODO <<= + // TODO >>= + + LogicalAnd Type = "LOGICAL_AND" // && + LogicalOr Type = "LOGICAL_OR" // || + + Not Type = "NOT" // ! + + // Ellipsis // ... + + Equal Type = "EQUAL" // == + NotEqual Type = "NOT_EQUAL" // != + Less Type = "LESS" // < + LessOrEqual Type = "LESS_OR_EQUAL" // <= + Greater Type = "GREATER" // > + GreaterOrEqual Type = "GREATER_OR_EQUAL" // >= + + // TODO <- + + // delimiters + Colon Type = "COLON" // : + Semicolon Type = "SEMICOLON" // ; + Comma Type = "COMMA" // , + Period Type = "PERIOD" // . + + // TODO rename those + LPAREN Type = "LPAREN" // ( + RPAREN Type = "RPAREN" // ) + LBRACE Type = "LBRACE" // { + RBRACE Type = "RBRACE" // } + + // keywords + Break Type = "BREAK" + Case Type = "CASE" + Chan Type = "CHAN" + Const Type = "CONST" + Continue Type = "CONTINUE" + Default Type = "DEFAULT" + Defer Type = "DEFER" + Else Type = "ELSE" + Fallthrough Type = "FALLTHROUGH" + For Type = "FOR" + Func Type = "FUNC" + Go Type = "GO" + Goto Type = "GOTO" + If Type = "IF" + Import Type = "IMPORT" + Interface Type = "INTERFACE" + Map Type = "MAP" + Package Type = "PACKAGE" + Range Type = "RANGE" + Return Type = "RETURN" + Select Type = "SELECT" + Struct Type = "STRUCT" + Switch Type = "SWITCH" + // TODO Type + Var Type = "VAR" + + // TODO remove + True Type = "TRUE" + False Type = "FALSE" + + // notwithstanding + // thetruthofthematter + // despiteallobjections + // whereas + // insofaras +) + +// check interfaces +var ( + _ fmt.Stringer = Type("FOO") +)
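The `Token.String` format defined in `tokens/tokens.go` above is what `TestGolden` compares line by line against the golden files. The following is a minimal standalone sketch of that formatting, not part of the patch: `Type` and `Token` are re-declared locally so the program runs on its own, only three of the `Type` constants are copied, and `main` is a hypothetical driver added for illustration.

```go
package main

import "fmt"

// Type mirrors tokens.Type from the patch: a string-backed token type.
type Type string

// Only a few constants are copied here for illustration.
const (
	EOF        Type = "EOF"
	Identifier Type = "IDENTIFIER"
	Semicolon  Type = "SEMICOLON"
)

// Token mirrors tokens.Token from the patch.
type Token struct {
	Offset  int
	Type    Type
	Literal string
}

// String follows the same rules as the method in the patch: the literal is
// appended when non-empty, and an automatically inserted semicolon (whose
// literal is a newline) is rendered as the word "newline".
func (tok Token) String() string {
	res := fmt.Sprintf("%d: %s", tok.Offset, string(tok.Type))
	if tok.Literal != "" {
		if tok.Type == Semicolon && tok.Literal == "\n" {
			res += " newline"
		} else {
			res += " " + tok.Literal
		}
	}
	return "[ " + res + " ]"
}

func main() {
	// Hypothetical token stream for the input "foo\n" with semicolon insertion.
	toks := []Token{
		{Offset: 0, Type: Identifier, Literal: "foo"},
		{Offset: 3, Type: Semicolon, Literal: "\n"},
		{Offset: 4, Type: EOF},
	}
	for _, tok := range toks {
		fmt.Println(tok)
	}
}
```

Rendering the inserted semicolon as `newline` keeps each token on a single line, which matters because `TestGolden` splits the expected output on `"\n"`.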