Ruby lexical structure
last modified October 18, 2023
Computer languages, like human languages, have a lexical structure. The source code of a Ruby program consists of tokens. Tokens are atomic code elements. The Ruby language has various lexical elements, such as comments, variables, literals, white space, operators, delimiters, and keywords.
Ruby comments
Comments are used by humans to clarify the source code.
There are two types of comments in Ruby: single-line and multi-line comments. Single-line comments begin with the # character. Multi-line comments are put between the =begin and =end tokens.
#!/usr/bin/ruby

=begin
  comments.rb
  author Jan Bodnar
=end

# prints message to the terminal
puts "Comments example"
An example showing both types of comments. Comments are ignored by the Ruby interpreter.
=begin
  comments.rb
  author Jan Bodnar
=end
This is an example of a multi-line comment. The two tokens must start at the beginning of the line.
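The following is a minimal sketch of how the interpreter separates comments from code. The variable name greeting is only illustrative. It shows that a # character inside a quoted string is string data, while a # outside of a string starts a comment that runs to the end of the line.

```ruby
# A single-line comment that occupies the whole line.
greeting = "# inside quotes, this is string data"  # a trailing comment

=begin
A multi-line comment. Both =begin and =end start in the first column.
=end

puts greeting
```

Only the puts line produces output; everything marked as a comment is skipped.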
Ruby white space
White space in Ruby is used to separate tokens and terminate statements in the source file. It is also used to improve readability of the source code.
if true then puts "A message" end
White space is required in some places, for example between the if keyword and the true keyword. Between the puts method and its string argument it is customary, although the interpreter also accepts the two written together. In other places, white space is forbidden: it cannot appear inside variable identifiers or language keywords.
a=1
b = 2
c  =  3
The amount of space put between tokens is irrelevant for the Ruby interpreter. However, it is important to have one style throughout the project.
#!/usr/bin/ruby

x = 5 + 3
puts x

x = 5
+ 3
puts x

x = 5 +
3
puts x
A newline, which is a form of white space, can be used to terminate statements.
x = 5 + 3
In the first case, we have one statement. The sum of the addition is assigned to the x variable. The variable holds 8.
x = 5
+ 3
Now, there are two statements. The first statement is terminated with a newline. The x variable is 5. There is another statement, + 3, which has no effect.
x = 5 +
3
Finally, we have one statement. The newline is preceded by the + binary operator, so the interpreter expects another value and looks for it on the second line. In this case, it takes both lines as one statement. The x variable is 8.
$ ./whitespace.rb
8
5
8
Ruby variables
A variable is an identifier that holds a value. In programming we say that we assign a value to a variable. Technically speaking, a variable is a reference to a location in computer memory where the value is stored. In Ruby, a variable can hold a string, a number, or various other objects. Variables can be assigned different values over time.
Variable names in Ruby are created from alphanumeric characters and the underscore (_) character. A variable name cannot begin with a number; this makes it easier for the interpreter to distinguish between a numeric literal and a variable. Variable names cannot begin with a capital letter either. If an identifier begins with a capital letter, it is considered to be a constant in Ruby.
value
value2
company_name
These are valid variable names.
12Val
exx$
first-name
These are examples of invalid variable names.
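The naming rules can be approximated with a regular expression. The sketch below is an illustration under stated assumptions, not the interpreter's own check: the constant LOCAL_VARIABLE_NAME and the method local_variable_name? are hypothetical names, the pattern covers ASCII characters only, and it does not exclude reserved keywords.

```ruby
# Hypothetical pattern: a lowercase letter or underscore first,
# then any number of ASCII letters, digits, or underscores.
LOCAL_VARIABLE_NAME = /\A[a-z_][A-Za-z0-9_]*\z/

# Hypothetical helper: true when the name matches the pattern above.
def local_variable_name?(name)
  LOCAL_VARIABLE_NAME.match?(name)
end

%w[value2 company_name Value 12Val exx$ first-name].each do |name|
  puts "#{name}: #{local_variable_name?(name)}"
end
```

Under these assumptions, value2 and company_name pass, while Value is rejected because a leading capital letter marks a constant, and the remaining three are rejected for the leading digit, the $ character, and the hyphen.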
Variable names may be preceded by two special characters, $ and @. They are used to create a specific variable scope.
Variable names are case sensitive. This means that price and pRice are two different identifiers.
#!/usr/bin/ruby

number = 10
numBER = 11

puts number, numBER
In our script, we assign two numeric values to two identifiers. The number and numBER are two different variables. However, this practice is discouraged. The names of the variables should not be misleading.
$ ./case.rb
10
11
Ruby constants
Constants are value holders that are meant to hold only one value over time. An identifier whose first letter is uppercase is a constant in Ruby. In programming it is a convention to write all characters of a constant in uppercase.
Unlike many other languages, Ruby does not prevent a constant from being assigned a new value. The interpreter only issues a warning if we assign a new value to an existing constant.
#!/usr/bin/ruby

Name = "Robert"
AGE = 23

Name = "Juliet"
In the above example, we create two constants. One of the constants is redefined later.
Name = "Robert"
AGE = 23
Two constants are created. When an identifier's name begins with an uppercase letter, then we have a constant in Ruby. By convention, constants are usually written in uppercase letters.
Name = "Juliet"
We redefine a constant, which issues a warning.
$ ./constants.rb
./constants.rb:6: warning: already initialized constant Name
./constants.rb:3: warning: previous definition of Name was here
Running the example.
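As a small illustration of how the first letter of an identifier decides what it is, the sketch below uses the defined? keyword to ask the interpreter how it classifies two names. The names Speed and speed are hypothetical and chosen only for this example.

```ruby
Speed = 90   # hypothetical constant: the name starts with an uppercase letter
speed = 45   # hypothetical local variable: the name starts with a lowercase letter

puts defined?(Speed)   # prints "constant"
puts defined?(speed)   # prints "local-variable"
```

The two identifiers differ only in the case of the first letter, yet Ruby treats them as different kinds of names.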
Ruby literal
A literal is a textual representation of a particular value of a type. Across programming languages, literal types include boolean, integer, floating point, string, character, and date; Ruby itself has no dedicated date literal. Technically, the value of a literal is fixed when the source code is compiled, while a variable is assigned its value at runtime.
age = 29
nationality = "Hungarian"
Here we assign two literals to variables. Number 29 and string "Hungarian" are literals.
#!/usr/bin/ruby

require 'date'

sng = true
name = "James"
job = nil
weight = 68.5
born = Date.parse("November 12, 1986")

puts "His name is #{name}"

if sng == true
  puts "He is single"
else
  puts "He is in a relationship"
end

puts "His job is #{job}"
puts "He weighs #{weight} kilograms"
puts "He was born in #{born}"
In the above example, we have multiple literals. A Boolean literal has the value true or false. "James" is a string literal. The nil literal denotes the absence of a value. 68.5 is a floating point literal. Finally, "November 12, 1986" is a string literal that Date.parse converts into a date object.
$ ./literals.rb
His name is James
He is single
His job is
He weighs 68.5 kilograms
He was born in 1986-11-12
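Each literal produces an object of a particular class. The following short sketch, which reuses the values from the example above, asks each value for its class. The variable name literals is only illustrative.

```ruby
# Values written as literals in the source code.
literals = [true, "James", nil, 68.5, 29]

# inspect shows the value as it would be written in source code.
literals.each do |value|
  puts "#{value.inspect} is an instance of #{value.class}"
end
```

The classes reported are TrueClass, String, NilClass, Float, and Integer, in that order.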
Ruby blocks
Ruby statements are often organized into blocks of code. A code block can be delimited using { } characters or do and end keywords.
#!/usr/bin/ruby

puts [2, -1, -4, 0].delete_if { |x| x < 0 }

[1, 2, 3].each do |e|
  puts e
end
In the example, we show two code blocks.
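To illustrate that the two delimiter styles describe the same kind of block, here is a minimal sketch in which one computation is written both ways. The method names double_with_braces and double_with_do_end are hypothetical.

```ruby
# Hypothetical helper: the block is delimited with curly brackets.
def double_with_braces(list)
  list.map { |n| n * 2 }
end

# Hypothetical helper: the same block delimited with do and end.
def double_with_do_end(list)
  list.map do |n|
    n * 2
  end
end

p double_with_braces([1, 2, 3])   # prints [2, 4, 6]
p double_with_do_end([1, 2, 3])   # prints [2, 4, 6]
```

A common convention is to use curly brackets for one-line blocks and do and end for blocks that span several lines.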
Flow control of Ruby code is often done with the if keyword. The keyword is followed by a block of code. In this case the block of code is delimited by the then and end keywords, where the first keyword is optional.
#!/usr/bin/ruby

if true then
  puts "Ruby language"
  puts "Ruby script"
end
In the above example, we have a simple block of code. It has two statements. The block is delimited by the then and end keywords. The then keyword can be omitted.
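The sketch below shows when the then keyword matters. The method names sign_one_line and sign_multi_line are hypothetical. When the condition and the body share a line, then separates them; when a newline follows the condition, then is not needed.

```ruby
# Hypothetical helper: on a single line, then separates the condition from the body.
def sign_one_line(n)
  if n > 0 then "positive" else "not positive" end
end

# Hypothetical helper: a newline after the condition makes then unnecessary.
def sign_multi_line(n)
  if n > 0
    "positive"
  else
    "not positive"
  end
end

puts sign_one_line(5)      # prints positive
puts sign_multi_line(-2)   # prints not positive
```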
Ruby sigils
The sigils $ and @ are special characters that denote the scope of a variable. The $ is used for global variables, @ for instance variables, and @@ for class variables.
$car_name = "Peugeot"
@sea_name = "Black sea"
@@species = "Cat"
Sigils are always placed at the beginning of the variable identifier.
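The three scopes can be seen together in a small class. This is a minimal sketch; the class Animal, its describe method, and the variable names are hypothetical and serve only to show where each kind of variable is visible.

```ruby
$car_name = "Peugeot"        # global variable: visible everywhere

class Animal                 # hypothetical class for illustration
  @@species = "Cat"          # class variable: shared by all instances

  def initialize(name)
    @name = name             # instance variable: belongs to one object
  end

  def describe
    "#{@name} is a #{@@species}, and the global car is #{$car_name}"
  end
end

tom = Animal.new("Tom")
puts tom.describe            # prints Tom is a Cat, and the global car is Peugeot
p tom.instance_variables     # prints [:@name]
p Animal.class_variables     # prints [:@@species]
```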
Ruby operators
An operator is a symbol used to perform an action on some value. (answers.com)
! + - ~ * ** / % << >> & | ^ == === != <=> >= > < <= = %= /= -= += *= **= .. ... not and or ?: && ||
This is a list of operators available in the Ruby language. We talk about operators later in the tutorial.
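As a brief taste of a few operators from the list, the following sketch evaluates some simple expressions. The values are arbitrary and chosen only for illustration.

```ruby
puts(2 ** 3)                  # exponentiation: prints 8
puts(7 % 3)                   # remainder: prints 1
puts(1 <=> 2)                 # three-way comparison: prints -1
p((1..3).to_a)                # inclusive range: prints [1, 2, 3]
p((1...3).to_a)               # exclusive range: prints [1, 2]
puts(5 > 3 ? "yes" : "no")    # conditional operator: prints yes
puts(true && false)           # logical and: prints false
puts(true || false)           # logical or: prints true
```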
Ruby delimiters
A delimiter is a sequence of one or more characters used to specify the boundary between separate, independent regions in plain text or other data stream. (wikipedia)
( ) [ ] { } , ; ' " | |
#!/usr/bin/ruby

name = "Jane"
occupation = 'Student'
numbers = [ 2, 3, 5, 3, 6, 2 ]

puts name; puts occupation
puts numbers[2]
numbers.each { |i| puts i }
puts ( 2 + 3 ) * 5
In the above example, we show the usage of various Ruby delimiters.
name = "Jane"
occupation = 'Student'
Single and double quotes are used to delimit a string in Ruby.
numbers = [ 2, 3, 5, 3, 6, 2 ]
The square brackets are used to set boundaries for an array. The commas are used to separate items in the array.
puts name; puts occupation
The semicolon is used in Ruby to separate two statements written on the same line.
puts numbers[2]
Delimiters can be used in different contexts. Here the square brackets are used to access an item in the array.
numbers.each { |i| puts i }
Curly brackets are used to define a block of code. Pipes delimit the block parameter, which receives the current array item on each iteration.
puts ( 2 + 3 ) * 5
Parentheses can be used to alter the evaluation of an expression.
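A minimal sketch of the effect: multiplication binds more tightly than addition, so the same numbers give different results with and without parentheses. The variable names are only illustrative.

```ruby
without_parens = 2 + 3 * 5     # multiplication is evaluated first
with_parens = (2 + 3) * 5      # parentheses force the addition first

puts without_parens   # prints 17
puts with_parens      # prints 25
```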
Ruby keywords
A keyword is a reserved word in the Ruby programming language. Keywords are used to perform a specific task in the computer program, for example, to define classes and methods, do repetitive tasks, or perform logical operations. A programmer cannot use a keyword as an ordinary variable.
alias and BEGIN begin break case class def defined? do else elsif END end ensure false for if in module next nil not or redo rescue retry return self super then true undef unless until when while yield
This is a list of Ruby keywords.
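One way to observe that keywords are reserved is to try to assign to them. The sketch below is an illustration under assumptions: the helper assignable_name? is hypothetical, and it relies on eval raising SyntaxError when the source text cannot be parsed. It should only ever be given trusted strings, since eval executes whatever code it receives.

```ruby
# Hypothetical helper: true when "<name> = 1" parses and runs as an assignment.
def assignable_name?(name)
  eval("#{name} = 1")
  true
rescue SyntaxError
  # The parser rejected the text, so the name cannot be used as a variable.
  false
end

puts assignable_name?("price")   # prints true
puts assignable_name?("while")   # prints false
puts assignable_name?("end")     # prints false
```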
This was the Ruby lexical structure.