
Assembler for ToyVM



Problem

After rolling my own virtual machine, I decided to implement an assembler for it. Ironically, it's all Java, since I needed to do a lot of text manipulations. Please, tell me anything that comes to mind.

Here is the main component:

ToyVMAssembler.java:

package net.coderodde.toy.assembler;

import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/**
 * This class is responsible for assembling a ToyVM source file.
 *
 * @author Rodion "rodde" Efremov
 * @version 1.6 (Mar 10, 2016).
 */
public class ToyVMAssembler {

    public static final byte REG1 = 0x00;
    public static final byte REG2 = 0x01;
    public static final byte REG3 = 0x02;
    public static final byte REG4 = 0x03;

    public static final byte ADD = 0x01;
    public static final byte NEG = 0x02;
    public static final byte MUL = 0x03;
    public static final byte DIV = 0x04;
    public static final byte MOD = 0x05;

    public static final byte CMP = 0x10;
    public static final byte JA  = 0x11;
    public static final byte JE  = 0x12;
    public static final byte JB  = 0x13;
    public static final byte JMP = 0x14;

    public static final byte CALL = 0x20;
    public static final byte RET  = 0x21;

    public static final byte LOAD  = 0x30;
    public static final byte STORE = 0x31;
    public static final byte CONST = 0x32;

    public static final byte HALT = 0x40;
    public static final byte INT  = 0x41;
    public static final byte NOP  = 0x42;

    public static final byte PUSH     = 0x50;
    public static final byte PUSH_ALL = 0x51;
    public static final byte POP      = 0x52;
    public static final byte POP_ALL  = 0x53;
    public static final byte LSP      = 0x54;

    /**
     * Specifies the token starting a one-line comment.
     */
    private static final String COMMENT_START_TOKEN = "//";

    private final List<String> sourceCodeLineList;
    private final List<Byte> machineCode = new ArrayList<>();
    private final Map<Integer, String> mapAddressToLabel = new HashMap<>();
    private final Map<String, Integer> mapLabelToAddress = new HashMap<>();
    private final Map<String, InstructionAssembler> mapOpcodeToAssembler
            = new HashMap<>();

    private final Map<String, Integer> mapWordNameToWordValue = new HashMap<>();
    private final Map<String, String> mapStringNameToStringValue
            = new HashMap<>();

    private final Map<String, Integer> mapWordNameToAddress   = new HashMap<>();
    private final Map<String, Integer> mapStringNameToAddress = new HashMap<>();

    private final Map<Integer, String> mapAddressToWordName = new HashMap<>();
    private final Map<Integer, String> mapAddressToStringName = new HashMap<>();

    private final Map<Integer, String> mapAddressToName = new HashMap<>();

    private final List<String> pendingLabels = new ArrayList<>();
    private final String fileName;
    private int lineNumber = 1;

    @FunctionalInterface
    private interface InstructionAssembler {
        void assemble(String line);
    }

    public ToyVMAssembler(String fileName, List<String> sourceCodeLineList) {
        Objects.requireNonNull(sourceCodeLineList,
                               "The input source code line list is null.");
        Objects.requireNonNull(fileName, "The input file name is null.");

        this.sourceCodeLineList = sourceCodeLineList;
        this.fileName = fileName;

        buildOpcodeMap();
    }

    private void buildOpcodeMap() {
        mapOpcodeToAssembler.put("add",   this::assembleAdd    );
        mapOpcodeToAssembler.put("neg",   this::assembleNeg    );
        mapOpcodeToAssembler.put("mul",   this::assembleMul    );
        mapOpcodeToAssembler.put("div",   this::assembleDiv    );
        mapOpcodeToAssembler.put("mod",   this::assembleMod    );
        mapOpcodeToAssembler.put("cmp",   this::assembleCmp    );
        mapOpcodeToAssembler.put("ja",    this::assembleJa     );
        mapOpcodeToAssembler.put("je",    this::assembleJe     );
        mapOpcodeToAssembler.put("jb",    this::assembleJb     );
        mapOpcodeToAssembler.put("jmp",   this::assembleJmp    );
        mapOpcodeToAssembler.put("call",  this::assembleCall   );
        mapOpcodeToAssembler.put("ret",   this::assembleRet    );
        mapOpcodeToAssembler.put("load",  this::assembleLoad   );
        mapOpcodeToAssembler.put("store", this::assembleStore  );
        mapOpcodeToAssembler.put("const", this::assembleConst  );
        mapOpcodeToAssembler.put("halt",  this::assembleHalt   );
        mapOpcodeToAssembler.put("int",   this::assembleInt    );
        mapOpcodeToAssembler.put("nop",   this::assembleNop    );
        mapOpcodeToAssembler.put("push",  this::assemblePush   );
        mapOpcodeToAssembler.put("pusha", this::assemblePushAll);
        mapOpcodeToAssembler.put("pop",   this::assemblePop    );
        mapOpcodeToAssembler.put("popa",  this::assemblePopAll );
        mapOpcodeToAssembler.put("lsp",   this::assembleLsp    );
        mapOpcodeToAssembler.put("word",  this::assembleWord   );
        mapOpcodeToAssembler.put("str",   this::assembleString );
    }

    public byte[] assemble() {
        for (String sourceCodeLine : sourceCodeLineList) {
            assembleSourceCodeLine(sourceCodeLine);
            lineNumber++;
        }

        resolveWords();
        resolveStrings();
        resolveLabels();

        resolveReferences();
        return convertMachineCodeToByteArray();
    }

    private void resolveWords() {
        for (Map.Entry<String, Integer> entry :
                mapWordNameToWordValue.entrySet()) {
            mapWordNameToAddress.put(entry.getKey(), machineCode.size());
            emitData(entry.getValue());
        }

        for (Map.Entry<Integer, String> entry :
                mapAddressToWordName.entrySet()) {
            setAddress(entry.getKey(),
                       mapWordNameToAddress.get(entry.getValue()));
        }
    }

    private void resolveStrings() {
        for (Map.Entry<String, String> entry :
                mapStringNameToStringValue.entrySet()) {
            mapStringNameToAddress.put(entry.getKey(), machineCode.size());
            emitString(entry.getValue());
        }

        for (Map.Entry<Integer, String> entry :
                mapAddressToStringName.entrySet()) {
            setAddress(entry.getKey(),
                       mapStringNameToAddress.get(entry.getValue()));
        }
    }

    // Resolves all symbolical references (labels).
    private void resolveLabels() {
        for (Map.Entry<Integer, String> entry : mapAddressToLabel.entrySet()) {
            String label = entry.getValue();

            if (!mapLabelToAddress.containsKey(label)) {
                throw new AssemblyException(
                        "ERROR: Label \"" + label + "\" is not defined.");
            }

            Integer address = mapLabelToAddress.get(label);
            setAddress(entry.getKey(), address);
        }
    }

    private void resolveReferences() {
        for (Map.Entry<Integer, String> entry : mapAddressToName.entrySet()) {
            String name = entry.getValue();

            if (mapStringNameToAddress.containsKey(name)) {
                setAddress(entry.getKey(), mapStringNameToAddress.get(name));
            } else if (mapWordNameToAddress.containsKey(name)) {
                setAddress(entry.getKey(), mapWordNameToAddress.get(name));
            } else {
                throw new AssemblyException(
                        errorHeader() +
                        "\"" + name + "\" is not declared.");
            }
        }
    }

    private void assembleSourceCodeLine(String line) {
        // Prune the possible comment.
        line = line.split(COMMENT_START_TOKEN)[0].trim();
        // Deal with the possible label.
        String[] parts = handleLabel(line);
        String actualLine;

        if (parts.length == 1) {
            actualLine = parts[0];
        } else {
            pendingLabels.add(parts[0]);
            actualLine = parts[1];
        }

        if (actualLine.trim().isEmpty()) {
            // Omit empty line.
            return;
        }

        // Resolve all preceding labels.
        pendingLabels.stream().forEach((label) -> {
            mapLabelToAddress.put(label, machineCode.size());
        });

        pendingLabels.clear();

        // Switch to assembing the actual instruction.
        InstructionAssembler instructionAssembler =
                mapOpcodeToAssembler.get(toTokens(actualLine)[0]);

        if (instructionAssembler == null) {
            throw new AssemblyException(
                    errorHeader() +
                    "Unknown instruction \"" + actualLine + "\".");
        }

        instructionAssembler.assemble(actualLine);
    }

    private void emitRegister(String registerToken) {
        switch (registerToken) {
            case "reg1":
                machineCode.add(REG1);
                return;

            case "reg2":
                machineCode.add(REG2);
                return;

            case "reg3":
                machineCode.add(REG3);
                return;

            case "reg4":
                machineCode.add(REG4);
                return;

            default:
                throw new AssemblyException(
                        errorHeader() +
                        "Unknown register token: \"" + registerToken + "\".");
        }
    }

    private void emitAddress(int address) {
        machineCode.add((byte) (address & 0xff));
        machineCode.add((byte)((address >>>= 8) & 0xff));
        machineCode.add((byte)((address >>>= 8) & 0xff));
        machineCode.add((byte)((address >>>= 8) & 0xff));
    }

    private void emitData(int data) {
        emitAddress(data);
    }

    private void emitByte(byte b) {
        machineCode.add(b);
    }

    private void emitString(String string) {
        for (char c : string.toCharArray()) {
            // We support only ANSI.
            machineCode.add((byte) c);
        }

        // Zero-terminate the string.
        machineCode.add((byte) 0);
    }

    private void emitOpcode(byte opcode) {
        machineCode.add(opcode);
    }

    private void setAddress(int index, int address) {
        machineCode.set(index, (byte)(address & 0xff));
        machineCode.set(index + 1, (byte)((address >>>= 8) & 0xff));
        machineCode.set(index + 2, (byte)((address >>>= 8) & 0xff));
        machineCode.set(index + 3, (byte)((address >>>= 8) & 0xff));
    }

    private void assembleAdd(String line) {
        String[] tokens = toTokens(line);

        if (tokens.length != 3) {
            throw new AssemblyException(
                    errorHeader() +
                    "The 'add' instruction requires exactly three tokens: " +
                    "\"add regi regj\"");
        }

        emitOpcode(ADD);
        emitRegister(tokens[1]);
        emitRegister(tokens[2]);
    }

    private void assembleNeg(String line) {
        String[] tokens = toTokens(line);

        if (tokens.length != 2) {
            throw new AssemblyException(
                    errorHeader() +
                    "The 'neg' instruction requires exactly two tokens: " +
                    "\"neg regi\"");
        }

        emitOpcode(NEG);
        emitRegister(tokens[1]);
    }

    private void assembleMul(String line) {
        String[] tokens = toTokens(line);

        if (tokens.length != 3) {
            throw new AssemblyException(
                    errorHeader() +
                    "The 'mul' instruction requires exactly three tokens: " +
                    "\"mul regi regj\"");
        }

        emitOpcode(MUL);
        emitRegister(tokens[1]);
        emitRegister(tokens[2]);
    }

    private void assembleDiv(String line) {
        String[] tokens = toTokens(line);

        if (tokens.length != 3) {
            throw new AssemblyException(
                    errorHeader() +
                    "The 'div' instruction requires exactly three tokens: " +
                    "\"div regi regj\"");
        }

        emitOpcode(DIV);
        emitRegister(tokens[1]);
        emitRegister(tokens[2]);
    }

    private void assembleMod(String line) {
        String[] tokens = toTokens(line);

        if (tokens.length != 3) {
            throw new AssemblyException(
                    errorHeader() +
                    "The 'mod' instruction requires exactly three tokens: " +
                    "\"mod regi regj\"");
        }

        emitOpcode(MOD);
        emitRegister(tokens[1]);
        emitRegister(tokens[2]);
    }

    private void assembleCmp(String line) {
        String[] tokens = toTokens(line);

        if (tokens.length != 3) {
            throw new AssemblyException(
                    errorHeader() +
                    "The 'cmp' instruction requires exactly three tokens: " +
                    "\"cmp regi regj\"");
        }

        emitOpcode(CMP);
        emitRegister(tokens[1]);
        emitRegister(tokens[2]);
    }

    private void assembleJa(String line) {
        String[] tokens = toTokens(line);

        if (tokens.length != 2) {
            throw new AssemblyException(
                    errorHeader() +
                    "The 'ja' instruction requires exactly two tokens: " +
                    "\"ja label\" or \"ja address\"");
        }

        emitOpcode(JA);

        if (isHexInteger(tokens[1])) {
            emitAddress(hexStringToInteger(tokens[1]));
        } else if (isInteger(tokens[1])) {
            emitAddress(toInteger(tokens[1]));
        } else {
            mapAddressToLabel.put(machineCode.size(), tokens[1]);
            emitAddress(0);
        }
    }

    private void assembleJe(String line) {
        String[] tokens = toTokens(line);

        if (tokens.length != 2) {
            throw new AssemblyException(
                    errorHeader() +
                    "The 'je' instruction requires exactly two tokens: " +
                    "\"je label\" or \"je address\"");
        }

        emitOpcode(JE);

        if (isHexInteger(tokens[1])) {
            emitAddress(hexStringToInteger(tokens[1]));
        } else if (isInteger(tokens[1])) {
            emitAddress(toInteger(tokens[1]));
        } else {
            mapAddressToLabel.put(machineCode.size(), tokens[1]);
            emitAddress(0);
        }
    }

    private void assembleJb(String line) {
        String[] tokens = toTokens(line);

        if (tokens.length != 2) {
            throw new AssemblyException(
                    errorHeader() +
                    "The 'jb' instruction requires exactly two tokens: " +
                    "\"jb label\" or \"jb address\"");
        }

        emitOpcode(JB);

        if (isHexInteger(tokens[1])) {
            emitAddress(hexStringToInteger(tokens[1]));
        } else if (isInteger(tokens[1])) {
            emitAddress(toInteger(tokens[1]));
        } else {
            mapAddressToLabel.put(machineCode.size(), tokens[1]);
            emitAddress(0);
        }
    }

    private void assembleJmp(String line) {
        String[] tokens = toTokens(line);

        if (tokens.length != 2) {
            throw new AssemblyException(
                    errorHeader() +
                    "The 'jmp' instructoin requires exactly two tokens: " +
                    "\"jmp label\" or \"jmp address\"");
        }

        emitOpcode(JMP);

        if (isHexInteger(tokens[1])) {
            emitAddress(hexStringToInteger(tokens[1]));
        } else if (isInteger(tokens[1])) {
            emitAddress(toInteger(tokens[1]));
        } else {
            mapAddressToLabel.put(machineCode.size(), tokens[1]);
            emitAddress(0);
        }
    }

    private void assembleCall(String line) {
        String[] tokens = toTokens(line);

        if (tokens.length != 2) {
            throw new AssemblyException(
                    errorHeader() +
                    "The 'call' instruction requires exactly two tokens: " +
                    "\"call label\" or \"call address\"");
        }

        emitOpcode(CALL);

        if (isHexInteger(tokens[1])) {
            emitAddress(hexStringToInteger(tokens[1]));
        } else if (isInteger(tokens[1])) {
            emitAddress(toInteger(tokens[1]));
        } else {
            mapAddressToLabel.put(machineCode.size(), tokens[1]);
            emitAddress(0);
        }
    }

    private void assembleRet(String line) {
        String[] tokens = toTokens(line);

        if (tokens.length != 1) {
            throw new AssemblyException(
                    errorHeader() +
                    "The 'ret' instruction must not have any arguments.");
        }

        emitOpcode(RET);
    }

    private void assembleLoad(String line) {
        String[] tokens = toTokens(line);

        if (tokens.length != 3) {
            throw new AssemblyException(
                    errorHeader() +
                    "The 'load' instruction requires exactly three tokens: " +
                    "\"load regi address\" or \"load regi label\"");
        }

        emitOpcode(LOAD);
        emitRegister(tokens[1]);

        if (isHexInteger(tokens[2])) {
            emitAddress(hexStringToInteger(tokens[2]));
        } else if (isInteger(tokens[2])) {
            emitAddress(toInteger(tokens[2]));
        } else {
            mapAddressToName.put(machineCode.size(), tokens[2]);
            emitAddress(0);
        }
    }

    private void assembleStore(String line) {
        String[] tokens = toTokens(line);

        if (tokens.length != 3) {
            throw new AssemblyException(
                    errorHeader() +
                    "The 'store' instruction requires exactly three tokens: " +
                    "\"store regi address\" or \"store regi label\"");
        }

        emitOpcode(STORE);
        emitRegister(tokens[1]);

        if (isHexInteger(tokens[2])) {
            emitAddress(hexStringToInteger(tokens[2]));
        } else if (isInteger(tokens[2])) {
            emitAddress(toInteger(tokens[2]));
        } else {
            mapAddressToName.put(machineCode.size(), tokens[2]);
            emitAddress(0);
        }
    }

    private void assembleConst(String line) {
        String[] tokens = toTokens(line);

        if (tokens.length != 3) {
            throw new AssemblyException(
                    errorHeader() +
                    "The 'const' instruction requires exactly three tokens: " +
                    "\"cosnt regi constant\"");
        }

        emitOpcode(CONST);
        emitRegister(tokens[1]);

        if (isHexInteger(tokens[2])) {
            emitData(hexStringToInteger(tokens[2]));
        } else if (isInteger(tokens[2])) {
            emitData(toInteger(tokens[2]));
        } else {
            mapAddressToName.put(machineCode.size(), tokens[2]);
            emitAddress(0);
        }
    }

    private void assembleHalt(String line) {
        String[] tokens = toTokens(line);

        if (tokens.length != 1) {
            throw new AssemblyException(
                    errorHeader() +
                    "The 'halt' instruction must not have any arguments.");
        }

        emitOpcode(HALT);
    }

    private void assembleInt(String line) {
        String[] tokens = toTokens(line);

        if (tokens.length != 2) {
            throw new AssemblyException(
                    errorHeader() +
                    "The 'int' instruction requires exactly two tokens: " +
                    "\"int interrupt_number\"");
        }

        emitOpcode(INT);

        if (isHexInteger(tokens[1])) {
            emitByte((byte) hexStringToInteger(tokens[1]));
        } else if (isInteger(tokens[1])) {
            emitByte((byte) toInteger(tokens[1]));
        } else {
            throw new AssemblyException(
                    "The interrupt number is not a valid decimal or " +
                    "hexadecimal integer: \"" + tokens[1] + "\".");
        }
    }

    private void assembleNop(String line) {
        String[] tokens = toTokens(line);

        if (tokens.length != 1) {
            throw new AssemblyException(
                    errorHeader() +
                    "The 'nop' instruction must not have arguments.");
        }

        emitOpcode(NOP);
    }

    private void assemblePush(String line) {
        String[] tokens = toTokens(line);

        if (tokens.length != 2) {
            throw new AssemblyException(
                    errorHeader() +
                    "The 'push' instruction requires exactly two tokens: " +
                    "\"push regi\"");
        }

        emitOpcode(PUSH);
        emitRegister(tokens[1]);
    }

    private void assemblePushAll(String line) {
        String[] tokens = toTokens(line);

        if (tokens.length != 1) {
            throw new AssemblyException(
                    errorHeader() +
                    "The 'pusha' instruction must not have arguments.");
        }

        emitOpcode(PUSH_ALL);
    }

    private void assemblePop(String line) {
        String[] tokens = toTokens(line);

        if (tokens.length != 2) {
            throw new AssemblyException(
                    errorHeader() +
                    "The 'pop' instruction requires exactly two tokens: " +
                    "\"pop regi\"");
        }

        emitOpcode(POP);
        emitRegister(tokens[1]);
    }

    private void assemblePopAll(String line) {
        String[] tokens = toTokens(line);

        if (tokens.length != 1) {
            throw new AssemblyException(
                    errorHeader() +
                    "The 'popa' instruction must not have arguments.");
        }

        emitOpcode(POP_ALL);
    }

    private void assembleLsp(String line) {
        String[] tokens = toTokens(line);

        if (tokens.length != 2) {
            throw new AssemblyException(
                    errorHeader() +
                    "The 'lsp' instruction must contain exactly two tokens: " +
                    "\"lsp regi\"");
        }

        emitOpcode(LSP);
        emitRegister(tokens[1]);
    }

    private void assembleWord(String line) {
        if (!pendingLabels.isEmpty()) {
            throw new AssemblyException(
                    errorHeader() +
                    "The word declaration expression must not have labels.");
        }

        String[] tokens = toTokens(line);

        if (tokens.length != 3) {
            throw new AssemblyException(
                    errorHeader() +
                    "The 'word' instruction requireis exactly three tokens: " +
                    "\"word name value\"");
        }

        int datum;

        if (isHexInteger(tokens[2])) {
            datum = hexStringToInteger(tokens[2]);
        } else if (isInteger(tokens[2])) {
            datum = toInteger(tokens[2]);
        } else {
            throw new AssemblyException(
                    "Cannot parse \"" + tokens[2] + "\" as a decimal or " +
                    "hexadecimal integer.");
        }

        if (mapWordNameToWordValue.containsKey(tokens[1])) {
            throw new AssemblyException(
                    errorHeader() +
                    "Word with name \"" + tokens[1] + "\" is already defined.");
        }

        if (mapStringNameToStringValue.containsKey(tokens[1])) {
            throw new AssemblyException(
                    errorHeader() +
                    "There is already a string with name \"" + tokens[1] +
                    "\"");
        }

        mapWordNameToWordValue.put(tokens[1], datum);
    }

    private void assembleString(String line) {
        if (!pendingLabels.isEmpty()) {
            throw new AssemblyException(
                    errorHeader() +
                    "The string declaration expression must not have labels.");
        }

        int firstQuoteIndex = line.indexOf("\"");

        if (firstQuoteIndex == -1) {
            throw new AssemblyException(
                    errorHeader() +
                    "The string must be enclosed in double quotation marks: " +
                    "str name \"string content\"");
        }

        int lastQuoteIndex  = line.lastIndexOf("\"");

        if (firstQuoteIndex == lastQuoteIndex) {
            throw new AssemblyException(
                    errorHeader() +
                    "The string declaration has only one double quote: " +
                    "requires exactly two.");
        }

        String str = line.substring(firstQuoteIndex + 1, lastQuoteIndex);
        String[] tokens = toTokens(line);

        if (tokens.length < 3) {
            throw new AssemblyException(
                    errorHeader() +
                    "The 'str' instruction requires exactly three tokens: " +
                    "\"str name value\"");
        }

        if (mapStringNameToStringValue.containsKey(tokens[1])) {
            throw new AssemblyException(
                    errorHeader() +
                    "String with name \"" + tokens[1] +
                    "\" is alredy defined.");
        }

        if (mapWordNameToWordValue.containsKey(tokens[1])) {
            throw new AssemblyException(
                    errorHeader() +
                    "There is already a word with name \"" + tokens[1] + "\"");
        }

        str = str.replace("\\n", "\n");
        mapStringNameToStringValue.put(tokens[1], str);
    }

    private boolean isInteger(String token) {
        try {
            Integer.parseInt(token);
            return true;
        } catch (NumberFormatException ex) {
            return false;
        }
    }

    private boolean isHexInteger(String token) {
        if (token.length() < 3
                || (!token.startsWith("0X") && !token.startsWith("0x"))) {
            return false;
        }

        String body = token.substring(2).toLowerCase();

        try {
            Long.parseLong(body, 16);
            return true;
        } catch (NumberFormatException ex) {
            return false;
        }
    }

    private int hexStringToInteger(String token) {
        if (!isHexInteger(token)) {
            throw new IllegalArgumentException(
                    "The input token is not a hexadecimal number.");
        }

        return (int) Long.parseLong(token.substring(2).toLowerCase(), 16);
    }

    private int toInteger(String token) {
        return Integer.parseInt(token);
    }

    private String[] toTokens(String line) {
        return line.split("\\s+");
    }

    private String[] handleLabel(String line) {
        int colonIndex = line.indexOf(":");

        if (colonIndex == -1) {
            return new String[]{ line };
        }

        if (line.indexOf(":", colonIndex + 1) != -1) {
            throw new AssemblyException(
                    errorHeader() +
                    "Only one label allowed per line. The input line is \"" +
                    line + "\".");
        }

        String label = line.substring(0, colonIndex).trim();
        String actualLine = line.substring(colonIndex + 1,
                                           line.length()).trim();

        this.mapLabelToAddress.put(label, machineCode.size());

        return new String[] {
            label,
            actualLine
        };
    }

    private byte[] convertMachineCodeToByteArray() {
        byte[] code = new byte[machineCode.size()];

        for (int i = 0; i < code.length; ++i) {
            code[i] = machineCode.get(i);
        }

        return code;
    }

    private String errorHeader() {
        return "ERROR in file \"" + fileName +
                "\" at line " + lineNumber + ": ";
    }
}

  • All the files are here.
  • A small package for the demonstration is here.
  • The virtual machine is here.
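
To make the address handling in the listing above easier to follow, here is a minimal, standalone sketch of the same idea: a 32-bit address is written least significant byte first, a jump to a label that has not been seen yet gets a zero placeholder, and the placeholder is overwritten once the label's address is known. The class `BackpatchDemo`, the method `assembleDemo`, and the three-instruction program it encodes are hypothetical and are not part of the ToyVM sources. Only the opcode values and the byte order are taken from the class above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the ToyVM sources.
public class BackpatchDemo {

    // Opcode values copied from ToyVMAssembler.
    static final byte JMP  = 0x14;
    static final byte HALT = 0x40;

    // Appends a 32-bit value, least significant byte first.
    static void emitAddress(List<Byte> code, int address) {
        for (int shift = 0; shift < 32; shift += 8) {
            code.add((byte) ((address >>> shift) & 0xff));
        }
    }

    // Overwrites the four bytes starting at 'index' with 'address',
    // again least significant byte first.
    static void setAddress(List<Byte> code, int index, int address) {
        for (int i = 0; i < 4; ++i) {
            code.set(index + i, (byte) ((address >>> (8 * i)) & 0xff));
        }
    }

    // Encodes the assumed program "jmp end / halt / end: halt".
    static byte[] assembleDemo() {
        List<Byte> code = new ArrayList<>();

        code.add(JMP);
        int patchIndex = code.size();  // Where the jump target is stored.
        emitAddress(code, 0);          // Placeholder until "end" is known.
        code.add(HALT);

        int endAddress = code.size();  // Address of the label "end".
        code.add(HALT);

        // The label is now known, so patch the placeholder.
        setAddress(code, patchIndex, endAddress);

        byte[] result = new byte[code.size()];

        for (int i = 0; i < result.length; ++i) {
            result[i] = code.get(i);
        }

        return result;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();

        for (byte b : assembleDemo()) {
            sb.append(String.format("%02x ", b & 0xff));
        }

        System.out.println(sb.toString().trim());
    }
}
```

In the sketch the label "end" lands at offset 6, so the four bytes after the `jmp` opcode become `06 00 00 00`. The real class defers the patching to `resolveLabels()`, which runs after all source lines have been processed.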
Original in English

After rolling my own virtual machine, I decided to implement an assembler for it. Ironically, it's all Java, since I needed to do a lot of text manipulations. Please, tell me anything that comes to mind.

Here is the main component:

ToyVMAssembler.java:

package net.coderodde.toy.assembler;

import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/**
 * This class is responsible for assembling a ToyVM source file.
 *
 * @author Rodion "rodde" Efremov
 * @version 1.6 (Mar 10, 2016).
 */
public class ToyVMAssembler {

    public static final byte REG1 = 0x00;
    public static final byte REG2 = 0x01;
    public static final byte REG3 = 0x02;
    public static final byte REG4 = 0x03;

    public static final byte ADD = 0x01;
    public static final byte NEG = 0x02;
    public static final byte MUL = 0x03;
    public static final byte DIV = 0x04;
    public static final byte MOD = 0x05;

    public static final byte CMP = 0x10;
    public static final byte JA  = 0x11;
    public static final byte JE  = 0x12;
    public static final byte JB  = 0x13;
    public static final byte JMP = 0x14;

    public static final byte CALL = 0x20;
    public static final byte RET  = 0x21;

    public static final byte LOAD  = 0x30;
    public static final byte STORE = 0x31;
    public static final byte CONST = 0x32;

    public static final byte HALT = 0x40;
    public static final byte INT  = 0x41;
    public static final byte NOP  = 0x42;

    public static final byte PUSH     = 0x50;
    public static final byte PUSH_ALL = 0x51;
    public static final byte POP      = 0x52;
    public static final byte POP_ALL  = 0x53;
    public static final byte LSP      = 0x54;

    /**
     * Specifies the token starting a one-line comment.
     */
    private static final String COMMENT_START_TOKEN = "//";

    private final List<String> sourceCodeLineList;
    private final List<Byte> machineCode = new ArrayList<>();
    private final Map<Integer, String> mapAddressToLabel = new HashMap<>();
    private final Map<String, Integer> mapLabelToAddress = new HashMap<>();
    private final Map<String, InstructionAssembler> mapOpcodeToAssembler
            = new HashMap<>();

    private final Map<String, Integer> mapWordNameToWordValue = new HashMap<>();
    private final Map<String, String> mapStringNameToStringValue
            = new HashMap<>();

    private final Map<String, Integer> mapWordNameToAddress   = new HashMap<>();
    private final Map<String, Integer> mapStringNameToAddress = new HashMap<>();

    private final Map<Integer, String> mapAddressToWordName = new HashMap<>();
    private final Map<Integer, String> mapAddressToStringName = new HashMap<>();

    private final Map<Integer, String> mapAddressToName = new HashMap<>();

    private final List<String> pendingLabels = new ArrayList<>();
    private final String fileName;
    private int lineNumber = 1;

    @FunctionalInterface
    private interface InstructionAssembler {
        void assemble(String line);
    }

    public ToyVMAssembler(String fileName, List<String> sourceCodeLineList) {
        Objects.requireNonNull(sourceCodeLineList,
                               "The input source code line list is null.");
        Objects.requireNonNull(fileName, "The input file name is null.");

        this.sourceCodeLineList = sourceCodeLineList;
        this.fileName = fileName;

        buildOpcodeMap();
    }

    private void buildOpcodeMap() {
        mapOpcodeToAssembler.put("add",   this::assembleAdd    );
        mapOpcodeToAssembler.put("neg",   this::assembleNeg    );
        mapOpcodeToAssembler.put("mul",   this::assembleMul    );
        mapOpcodeToAssembler.put("div",   this::assembleDiv    );
        mapOpcodeToAssembler.put("mod",   this::assembleMod    );
        mapOpcodeToAssembler.put("cmp",   this::assembleCmp    );
        mapOpcodeToAssembler.put("ja",    this::assembleJa     );
        mapOpcodeToAssembler.put("je",    this::assembleJe     );
        mapOpcodeToAssembler.put("jb",    this::assembleJb     );
        mapOpcodeToAssembler.put("jmp",   this::assembleJmp    );
        mapOpcodeToAssembler.put("call",  this::assembleCall   );
        mapOpcodeToAssembler.put("ret",   this::assembleRet    );
        mapOpcodeToAssembler.put("load",  this::assembleLoad   );
        mapOpcodeToAssembler.put("store", this::assembleStore  );
        mapOpcodeToAssembler.put("const", this::assembleConst  );
        mapOpcodeToAssembler.put("halt",  this::assembleHalt   );
        mapOpcodeToAssembler.put("int",   this::assembleInt    );
        mapOpcodeToAssembler.put("nop",   this::assembleNop    );
        mapOpcodeToAssembler.put("push",  this::assemblePush   );
        mapOpcodeToAssembler.put("pusha", this::assemblePushAll);
        mapOpcodeToAssembler.put("pop",   this::assemblePop    );
        mapOpcodeToAssembler.put("popa",  this::assemblePopAll );
        mapOpcodeToAssembler.put("lsp",   this::assembleLsp    );
        mapOpcodeToAssembler.put("word",  this::assembleWord   );
        mapOpcodeToAssembler.put("str",   this::assembleString );
    }

    public byte[] assemble() {
        for (String sourceCodeLine : sourceCodeLineList) {
            assembleSourceCodeLine(sourceCodeLine);
            lineNumber++;
        }

        resolveWords();
        resolveStrings();
        resolveLabels();

        resolveReferences();
        return convertMachineCodeToByteArray();
    }

    private void resolveWords() {
        for (Map.Entry<String, Integer> entry :
                mapWordNameToWordValue.entrySet()) {
            mapWordNameToAddress.put(entry.getKey(), machineCode.size());
            emitData(entry.getValue());
        }

        for (Map.Entry<Integer, String> entry :
                mapAddressToWordName.entrySet()) {
            setAddress(entry.getKey(),
                       mapWordNameToAddress.get(entry.getValue()));
        }
    }

    private void resolveStrings() {
        for (Map.Entry<String, String> entry :
                mapStringNameToStringValue.entrySet()) {
            mapStringNameToAddress.put(entry.getKey(), machineCode.size());
            emitString(entry.getValue());
        }

        for (Map.Entry<Integer, String> entry :
                mapAddressToStringName.entrySet()) {
            setAddress(entry.getKey(),
                       mapStringNameToAddress.get(entry.getValue()));
        }
    }

    // Resolves all symbolical references (labels).
    private void resolveLabels() {
        for (Map.Entry<Integer, String> entry : mapAddressToLabel.entrySet()) {
            String label = entry.getValue();

            if (!mapLabelToAddress.containsKey(label)) {
                throw new AssemblyException(
                        "ERROR: Label \"" + label + "\" is not defined.");
            }

            Integer address = mapLabelToAddress.get(label);
            setAddress(entry.getKey(), address);
        }
    }

    private void resolveReferences() {
        for (Map.Entry<Integer, String> entry : mapAddressToName.entrySet()) {
            String name = entry.getValue();

            if (mapStringNameToAddress.containsKey(name)) {
                setAddress(entry.getKey(), mapStringNameToAddress.get(name));
            } else if (mapWordNameToAddress.containsKey(name)) {
                setAddress(entry.getKey(), mapWordNameToAddress.get(name));
            } else {
                throw new AssemblyException(
                        errorHeader() +
                        "\"" + name + "\" is not declared.");
            }
        }
    }

    private void assembleSourceCodeLine(String line) {
        // Prune the possible comment.
        line = line.split(COMMENT_START_TOKEN)[0].trim();
        // Deal with the possible label.
        String[] parts = handleLabel(line);
        String actualLine;

        if (parts.length == 1) {
            actualLine = parts[0];
        } else {
            pendingLabels.add(parts[0]);
            actualLine = parts[1];
        }

        if (actualLine.trim().isEmpty()) {
            // Omit empty line.
            return;
        }

        // Resolve all preceding labels.
        pendingLabels.stream().forEach((label) -> {
            mapLabelToAddress.put(label, machineCode.size());
        });

        pendingLabels.clear();

        // Switch to assembing the actual instruction.
        InstructionAssembler instructionAssembler =
                mapOpcodeToAssembler.get(toTokens(actualLine)[0]);

        if (instructionAssembler == null) {
            throw new AssemblyException(
                    errorHeader() +
                    "Unknown instruction \"" + actualLine + "\".");
        }

        instructionAssembler.assemble(actualLine);
    }

    private void emitRegister(String registerToken) {
        switch (registerToken) {
            case "reg1":
                machineCode.add(REG1);
                return;

            case "reg2":
                machineCode.add(REG2);
                return;

            case "reg3":
                machineCode.add(REG3);
                return;

            case "reg4":
                machineCode.add(REG4);
                return;

            default:
                throw new AssemblyException(
                        errorHeader() +
                        "Unknown register token: \"" + registerToken + "\".");
        }
    }

    private void emitAddress(int address) {
        machineCode.add((byte) (address & 0xff));
        machineCode.add((byte)((address >>>= 8) & 0xff));
        machineCode.add((byte)((address >>>= 8) & 0xff));
        machineCode.add((byte)((address >>>= 8) & 0xff));
    }

private void emitData(int data) {         emitAddress(data);     }      private void emitByte(byte b) {         machineCode.add(b);     }      private void emitString(String string) {         for (char c : string.toCharArray()) {             // We support only ANSI.             machineCode.add((byte) c);         }          // Zero-terminate the string.         machineCode.add((byte) 0);     }      private void emitOpcode(byte opcode) {         machineCode.add(opcode);     }      private void setAddress(int index, int address) {         machineCode.set(index, (byte)(address & 0xff));         machineCode.set(index + 1, (byte)((address >>>= 8) & 0xff));         machineCode.set(index + 2, (byte)((address >>>= 8) & 0xff));         machineCode.set(index + 3, (byte)((address >>>= 8) & 0xff));     }      private void assembleAdd(String line) {         String[] tokens = toTokens(line);          if (tokens.length != 3) {             throw new AssemblyException(                     errorHeader() +                     "The 'add' instruction requires exactly three tokens: " +                     "\"add regi regj\"");         }          emitOpcode(ADD);         emitRegister(tokens[1]);         emitRegister(tokens[2]);     }      private void assembleNeg(String line) {         String[] tokens = toTokens(line);          if (tokens.length != 2) {             throw new AssemblyException(                     errorHeader() +                     "The 'neg' instruction requires exactly two tokens: " +                     "\"neg regi\"");         }          emitOpcode(NEG);         emitRegister(tokens[1]);     }      private void assembleMul(String line) {         String[] tokens = toTokens(line);          if (tokens.length != 3) {             throw new AssemblyException(                     errorHeader() +                     "The 'mul' instruction requires exactly three tokens: " +                     "\"mul regi regj\"");         }          emitOpcode(MUL);         
emitRegister(tokens[1]);         emitRegister(tokens[2]);     }      private void assembleDiv(String line) {         String[] tokens = toTokens(line);          if (tokens.length != 3) {             throw new AssemblyException(                     errorHeader() +                     "The 'div' instruction requires exactly three tokens: " +                     "\"div regi regj\"");         }          emitOpcode(DIV);         emitRegister(tokens[1]);         emitRegister(tokens[2]);     }      private void assembleMod(String line) {         String[] tokens = toTokens(line);          if (tokens.length != 3) {             throw new AssemblyException(                     errorHeader() +                     "The 'mod' instruction requires exactly three tokens: " +                     "\"mod regi regj\"");         }          emitOpcode(MOD);         emitRegister(tokens[1]);         emitRegister(tokens[2]);     }      private void assembleCmp(String line) {         String[] tokens = toTokens(line);          if (tokens.length != 3) {             throw new AssemblyException(                     errorHeader() +                     "The 'cmp' instruction requires exactly three tokens: " +                     "\"cmp regi regj\"");         }          emitOpcode(CMP);         emitRegister(tokens[1]);         emitRegister(tokens[2]);     }      private void assembleJa(String line) {         String[] tokens = toTokens(line);          if (tokens.length != 2) {             throw new AssemblyException(                     errorHeader() +                      "The 'ja' instruction requires exactly two tokens: " +                     "\"ja label\" or \"ja address\"");         }          emitOpcode(JA);          if (isHexInteger(tokens[1])) {             emitAddress(hexStringToInteger(tokens[1]));         } else if (isInteger(tokens[1])) {             emitAddress(toInteger(tokens[1]));         } else {             mapAddressToLabel.put(machineCode.size(), tokens[1]);             
emitAddress(0);         }     }      private void assembleJe(String line) {         String[] tokens = toTokens(line);          if (tokens.length != 2) {             throw new AssemblyException(                     errorHeader() +                      "The 'je' instruction requires exactly two tokens: " +                     "\"je label\" or \"je address\"");         }          emitOpcode(JE);          if (isHexInteger(tokens[1])) {             emitAddress(hexStringToInteger(tokens[1]));         } else if (isInteger(tokens[1])) {             emitAddress(toInteger(tokens[1]));         } else {             mapAddressToLabel.put(machineCode.size(), tokens[1]);             emitAddress(0);         }     }      private void assembleJb(String line) {         String[] tokens = toTokens(line);          if (tokens.length != 2) {             throw new AssemblyException(                     errorHeader() +                      "The 'jb' instruction requires exactly two tokens: " +                     "\"jb label\" or \"jb address\"");         }          emitOpcode(JB);          if (isHexInteger(tokens[1])) {             emitAddress(hexStringToInteger(tokens[1]));         } else if (isInteger(tokens[1])) {             emitAddress(toInteger(tokens[1]));         } else {             mapAddressToLabel.put(machineCode.size(), tokens[1]);             emitAddress(0);         }     }      private void assembleJmp(String line) {         String[] tokens = toTokens(line);          if (tokens.length != 2) {             throw new AssemblyException(                     errorHeader() +                     "The 'jmp' instructoin requires exactly two tokens: " +                     "\"jmp label\" or \"jmp address\"");         }          emitOpcode(JMP);          if (isHexInteger(tokens[1])) {             emitAddress(hexStringToInteger(tokens[1]));         } else if (isInteger(tokens[1])) {             emitAddress(toInteger(tokens[1]));         } else {             
mapAddressToLabel.put(machineCode.size(), tokens[1]);             emitAddress(0);         }     }      private void assembleCall(String line) {         String[] tokens = toTokens(line);          if (tokens.length != 2) {             throw new AssemblyException(                     errorHeader() +                      "The 'call' instruction requires exactly two tokens: " +                     "\"call label\" or \"call address\"");         }          emitOpcode(CALL);          if (isHexInteger(tokens[1])) {             emitAddress(hexStringToInteger(tokens[1]));         } else if (isInteger(tokens[1])) {             emitAddress(toInteger(tokens[1]));         } else {             mapAddressToLabel.put(machineCode.size(), tokens[1]);             emitAddress(0);         }     }      private void assembleRet(String line) {         String[] tokens = toTokens(line);          if (tokens.length != 1) {             throw new AssemblyException(                     errorHeader() +                      "The 'ret' instruction must not have any arguments.");         }          emitOpcode(RET);     }      private void assembleLoad(String line) {         String[] tokens = toTokens(line);          if (tokens.length != 3) {             throw new AssemblyException(                     errorHeader() +                     "The 'load' instruction requires exactly three tokens: " +                     "\"load regi address\" or \"load regi label\"");         }          emitOpcode(LOAD);         emitRegister(tokens[1]);          if (isHexInteger(tokens[2])) {             emitAddress(hexStringToInteger(tokens[2]));         } else if (isInteger(tokens[2])) {             emitAddress(toInteger(tokens[2]));         } else {             mapAddressToName.put(machineCode.size(), tokens[2]);             emitAddress(0);         }     }      private void assembleStore(String line) {         String[] tokens = toTokens(line);          if (tokens.length != 3) {             throw new AssemblyException(    
                 errorHeader() +                     "The 'store' instruction requires exactly three tokens: " +                     "\"store regi address\" or \"store regi label\"");         }          emitOpcode(STORE);         emitRegister(tokens[1]);          if (isHexInteger(tokens[2])) {             emitAddress(hexStringToInteger(tokens[2]));         } else if (isInteger(tokens[2])) {             emitAddress(toInteger(tokens[2]));         } else {             mapAddressToName.put(machineCode.size(), tokens[2]);             emitAddress(0);         }     }      private void assembleConst(String line) {         String[] tokens = toTokens(line);          if (tokens.length != 3) {             throw new AssemblyException(                     errorHeader() +                     "The 'const' instruction requires exactly three tokens: " +                     "\"cosnt regi constant\"");         }          emitOpcode(CONST);         emitRegister(tokens[1]);          if (isHexInteger(tokens[2])) {             emitData(hexStringToInteger(tokens[2]));         } else if (isInteger(tokens[2])) {             emitData(toInteger(tokens[2]));         } else {             mapAddressToName.put(machineCode.size(), tokens[2]);             emitAddress(0);         }     }      private void assembleHalt(String line) {         String[] tokens = toTokens(line);          if (tokens.length != 1) {             throw new AssemblyException(                     errorHeader() +                     "The 'halt' instruction must not have any arguments.");         }          emitOpcode(HALT);     }      private void assembleInt(String line) {         String[] tokens = toTokens(line);          if (tokens.length != 2) {             throw new AssemblyException(                     errorHeader() +                     "The 'int' instruction requires exactly two tokens: " +                     "\"int interrupt_number\"");         }          emitOpcode(INT);          if (isHexInteger(tokens[1])) {         
    emitByte((byte) hexStringToInteger(tokens[1]));         } else if (isInteger(tokens[1])) {             emitByte((byte) toInteger(tokens[1]));         } else {             throw new AssemblyException(                     "The interrupt number is not a valid decimal or " +                     "hexadecimal integer: \"" + tokens[1] + "\".");         }     }      private void assembleNop(String line) {         String[] tokens = toTokens(line);          if (tokens.length != 1) {             throw new AssemblyException(                     errorHeader() +                      "The 'nop' instruction must not have arguments.");         }          emitOpcode(NOP);     }      private void assemblePush(String line) {         String[] tokens = toTokens(line);          if (tokens.length != 2) {             throw new AssemblyException(                     errorHeader() +                     "The 'push' instruction requires exactly two tokens: " +                      "\"push regi\"");         }          emitOpcode(PUSH);         emitRegister(tokens[1]);     }      private void assemblePushAll(String line) {         String[] tokens = toTokens(line);          if (tokens.length != 1) {             throw new AssemblyException(                     errorHeader() +                     "The 'pusha' instruction must not have arguments.");         }          emitOpcode(PUSH_ALL);     }      private void assemblePop(String line) {         String[] tokens = toTokens(line);          if (tokens.length != 2) {             throw new AssemblyException(                     errorHeader() +                     "The 'pop' instruction requires exactly two tokens: " +                      "\"pop regi\"");         }          emitOpcode(POP);         emitRegister(tokens[1]);     }      private void assemblePopAll(String line) {         String[] tokens = toTokens(line);          if (tokens.length != 1) {             throw new AssemblyException(                     errorHeader() +                     
"The 'popa' instruction must not have arguments.");         }          emitOpcode(POP_ALL);     }      private void assembleLsp(String line) {         String[] tokens = toTokens(line);          if (tokens.length != 2) {             throw new AssemblyException(                     errorHeader() +                     "The 'lsp' instruction must contain exactly two tokens: " +                     "\"lsp regi\"");         }          emitOpcode(LSP);         emitRegister(tokens[1]);     }      private void assembleWord(String line) {         if (!pendingLabels.isEmpty()) {             throw new AssemblyException(                     errorHeader() +                     "The word declaration expression must not have labels.");         }          String[] tokens = toTokens(line);          if (tokens.length != 3) {             throw new AssemblyException(                     errorHeader() +                      "The 'word' instruction requireis exactly three tokens: " +                     "\"word name value\"");         }          int datum;          if (isHexInteger(tokens[2])) {             datum = hexStringToInteger(tokens[2]);         } else if (isInteger(tokens[2])) {             datum = toInteger(tokens[2]);         } else {             throw new AssemblyException(                     "Cannot parse \"" + tokens[2] + "\" as a decimal or " +                      "hexadecimal integer.");         }          if (mapWordNameToWordValue.containsKey(tokens[1])) {             throw new AssemblyException(                     errorHeader() +                     "Word with name \"" + tokens[1] + "\" is already defined.");         }          if (mapStringNameToStringValue.containsKey(tokens[1])) {             throw new AssemblyException(                     errorHeader() +                     "There is already a string with name \"" + tokens[1] +                      "\"");         }          mapWordNameToWordValue.put(tokens[1], datum);     }      private void 
assembleString(String line) {         if (!pendingLabels.isEmpty()) {             throw new AssemblyException(                     errorHeader() +                      "The string declaration expression must not have labels.");         }          int firstQuoteIndex = line.indexOf("\"");          if (firstQuoteIndex == -1) {             throw new AssemblyException(                     errorHeader() +                     "The string must be enclosed in double quotation marks: " +                     "str name \"string content\"");         }          int lastQuoteIndex  = line.lastIndexOf("\"");          if (firstQuoteIndex == lastQuoteIndex) {             throw new AssemblyException(                     errorHeader() +                     "The string declaration has only one double quote: " +                     "requires exactly two.");         }          String str = line.substring(firstQuoteIndex + 1, lastQuoteIndex);         String[] tokens = toTokens(line);          if (tokens.length < 3) {             throw new AssemblyException(                     errorHeader() +                      "The 'str' instruction requires exactly three tokens: " +                     "\"str name value\"");         }          if (mapStringNameToStringValue.containsKey(tokens[1])) {             throw new AssemblyException(                     errorHeader() +                     "String with name \"" + tokens[1] +                      "\" is alredy defined.");         }          if (mapWordNameToWordValue.containsKey(tokens[1])) {             throw new AssemblyException(                     errorHeader() +                     "There is already a word with name \"" + tokens[1] + "\"");         }          str = str.replace("\\n", "\n");         mapStringNameToStringValue.put(tokens[1], str);     }      private boolean isInteger(String token) {         try {             Integer.parseInt(token);             return true;         } catch (NumberFormatException ex) {             return 
false;         }     }      private boolean isHexInteger(String token) {         if (token.length() < 3                  || (!token.startsWith("0X") && !token.startsWith("0x"))) {             return false;         }          String body = token.substring(2).toLowerCase();          try {             Long.parseLong(body, 16);             return true;         } catch (NumberFormatException ex) {             return false;         }     }      private int hexStringToInteger(String token) {         if (!isHexInteger(token)) {             throw new IllegalArgumentException(                     "The input token is not a hexadecimal number.");         }          return (int) Long.parseLong(token.substring(2).toLowerCase(), 16);      }      private int toInteger(String token) {         return Integer.parseInt(token);     }      private String[] toTokens(String line) {         return line.split("\\s+");     }      private String[] handleLabel(String line) {         int colonIndex = line.indexOf(":");          if (colonIndex == -1) {             return new String[]{ line };         }          if (line.indexOf(":", colonIndex + 1) != -1) {             throw new AssemblyException(                     errorHeader() +                     "Only one label allowed per line. 
The input line is \"" +                     line + "\".");         }          String label = line.substring(0, colonIndex).trim();         String actualLine = line.substring(colonIndex + 1,                                            line.length()).trim();          this.mapLabelToAddress.put(label, machineCode.size());          return new String[] {              label,             actualLine         };     }      private byte[] convertMachineCodeToByteArray() {         byte[] code = new byte[machineCode.size()];          for (int i = 0; i < code.length; ++i) {             code[i] = machineCode.get(i);         }          return code;     }      private String errorHeader() {         return "ERROR in file \"" + fileName +                 "\" at line " + lineNumber + ": ";     } } 
  • All the files are here.
  • A small pack for demonstration is here.
  • The virtual machine is here.
     

Answers

4 votes, best answer

I see some things that may help you improve your code.

Make your code more data driven

For any given instruction, the relevant pieces are spread over quite a bit of code. A better approach might be to have something like an Instruction class that would contain the instruction string, the encoded hex value, the length and the number and type of arguments.
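A minimal sketch of what such a table could look like. The names `OperandKind`, `InstructionSpec` and `InstructionTable` are hypothetical and do not appear in the original code; the opcode values are copied from the constants in the question, and only a handful of instructions are listed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical operand categories; the original code has no such type.
enum OperandKind { REGISTER, ADDRESS, IMMEDIATE, BYTE }

// Hypothetical descriptor: one object holds everything known about an instruction.
final class InstructionSpec {
    final String mnemonic;
    final byte opcode;
    final OperandKind[] operands;

    InstructionSpec(String mnemonic, int opcode, OperandKind... operands) {
        this.mnemonic = mnemonic;
        this.opcode = (byte) opcode;
        this.operands = operands;
    }

    // Encoded length in bytes: one opcode byte plus the operand bytes.
    // Registers and interrupt numbers take one byte, everything else four.
    int length() {
        int length = 1;
        for (OperandKind kind : operands) {
            length += (kind == OperandKind.REGISTER || kind == OperandKind.BYTE) ? 1 : 4;
        }
        return length;
    }

    // Usage string for error messages, e.g. "add REGISTER REGISTER".
    String usage() {
        StringBuilder sb = new StringBuilder(mnemonic);
        for (OperandKind kind : operands) {
            sb.append(' ').append(kind);
        }
        return sb.toString();
    }
}

final class InstructionTable {
    private static final Map<String, InstructionSpec> SPECS = new HashMap<>();

    private static void register(String mnemonic, int opcode, OperandKind... operands) {
        SPECS.put(mnemonic, new InstructionSpec(mnemonic, opcode, operands));
    }

    static {
        register("add", 0x01, OperandKind.REGISTER, OperandKind.REGISTER);
        register("neg", 0x02, OperandKind.REGISTER);
        register("jmp", 0x14, OperandKind.ADDRESS);
        register("ret", 0x21);
        register("const", 0x32, OperandKind.REGISTER, OperandKind.IMMEDIATE);
        register("int", 0x41, OperandKind.BYTE);
        // The remaining instructions would follow the same pattern.
    }

    // Returns null for an unknown mnemonic.
    static InstructionSpec lookup(String mnemonic) {
        return SPECS.get(mnemonic);
    }
}
```

With a table like this, adding an instruction becomes a one-line change, and the argument count, the encoded length and the usage text in error messages all come from the same place.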

Make error reporting more consistent

One of the problems with repeating very similar code multiple times is that small errors can creep in and be overlooked in the volume of code. For example, the word "instruction" is misspelled ("instructoin") in the error message for the jmp instruction, but not in those for jb or the other instructions, even though their error strings are nearly identical.
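One way to keep the messages consistent is to build them all from a single template. The sketch below is only an illustration: `TokenChecks` and its methods are hypothetical names, `IllegalArgumentException` stands in for the question's `AssemblyException` (whose definition is not shown), and it assumes no instruction needs more than three tokens.

```java
// Hypothetical helper: every token-count error goes through one template,
// so a typo can only ever exist in one place.
final class TokenChecks {
    private static final String[] COUNT_WORDS = { "zero", "one", "two", "three" };

    // Builds the message in the same wording the original assembler uses.
    static String message(String mnemonic, int expected, String usage) {
        if (expected == 1) {
            return "The '" + mnemonic + "' instruction must not have any arguments.";
        }
        return "The '" + mnemonic + "' instruction requires exactly "
                + COUNT_WORDS[expected] + " tokens: \"" + usage + "\"";
    }

    // Throws if the line does not have the expected number of tokens.
    // tokens[0] is the mnemonic; String.split always yields at least one element.
    static void requireTokenCount(String[] tokens, int expected, String usage) {
        if (tokens.length != expected) {
            throw new IllegalArgumentException(message(tokens[0], expected, usage));
        }
    }
}
```

Each `assembleXxx` method could then start with a call such as `TokenChecks.requireTokenCount(tokens, 2, "jmp label")` instead of its own hand-written check.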

Separate tokenizing from parsing

Classical assembler and compiler construction separates tokenizing from parsing. The tokenizer (also called a "lexer") produces a series of tokens, each identified by type and value, and hands them to the parser. This allows more complex constructs, such as an expression, to be parsed without cluttering the parser with code that decides whether something is a number, an instruction mnemonic, or a reference to a register. Doing so makes both parts easier to modify and maintain.
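As a rough illustration of that split, here is a small tokenizer sketch. `TokenType`, `Token` and `Lexer` are hypothetical names, not part of the question's code. It is deliberately simplified: it assumes a label's colon is followed by whitespace and it does not handle quoted `str` literals.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical token categories.
enum TokenType { LABEL_DEF, MNEMONIC, REGISTER, NUMBER, IDENTIFIER }

final class Token {
    final TokenType type;
    final String text;

    Token(TokenType type, String text) {
        this.type = type;
        this.text = text;
    }

    @Override
    public String toString() {
        return type + "(" + text + ")";
    }
}

final class Lexer {
    // Mnemonics and register names taken from the question's opcode map.
    private static final Set<String> MNEMONICS = new HashSet<>(Arrays.asList(
            "add", "neg", "mul", "div", "mod", "cmp", "ja", "je", "jb", "jmp",
            "call", "ret", "load", "store", "const", "halt", "int", "nop",
            "push", "pusha", "pop", "popa", "lsp", "word", "str"));
    private static final Set<String> REGISTERS = new HashSet<>(Arrays.asList(
            "reg1", "reg2", "reg3", "reg4"));

    static List<Token> tokenize(String line) {
        List<Token> tokens = new ArrayList<>();
        // Strip a trailing "//" comment, as the original assembler does.
        int comment = line.indexOf("//");
        if (comment >= 0) {
            line = line.substring(0, comment);
        }
        for (String word : line.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // blank or comment-only line
            }
            if (word.endsWith(":")) {
                tokens.add(new Token(TokenType.LABEL_DEF,
                                     word.substring(0, word.length() - 1)));
            } else {
                tokens.add(new Token(classify(word), word));
            }
        }
        return tokens;
    }

    private static TokenType classify(String word) {
        if (REGISTERS.contains(word)) {
            return TokenType.REGISTER;
        }
        if (MNEMONICS.contains(word)) {
            return TokenType.MNEMONIC;
        }
        if (word.matches("0[xX][0-9a-fA-F]+|-?\\d+")) {
            return TokenType.NUMBER;
        }
        return TokenType.IDENTIFIER;
    }
}
```

A parser built on top of this only has to look at `token.type`; it never needs to call `isInteger` or `isHexInteger` itself.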

Use existing string handling

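The emitRegister method uses a switch to enumerate each register and emit the corresponding value. One could also do something like this instead:

    // requires: import java.util.Arrays;
    // The array must stay sorted for Arrays.binarySearch to work.
    String[] regNames = { "reg1", "reg2", "reg3", "reg4" };
    int regnum = Arrays.binarySearch(regNames, registerToken);
    if (regnum >= 0) {
        // machineCode is a List<Byte>, so the index needs a cast.
        machineCode.add((byte) regnum);
    } else {
        throw ...
    }

Now it's trivial to add "reg5", for example, just by listing its name, as long as the array stays in sorted order.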

Create functions for common operations

There are several places in which an int is turned into little-endian format (emitAddress and setAddress both do it byte by byte). The code would be clearer and more compact if that operation were implemented as a single function instead.
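A minimal sketch of such a helper follows. `LittleEndian` and `CodeBuffer` are hypothetical names; `CodeBuffer` only mimics the `machineCode` list from the question so that the example is self-contained.

```java
import java.util.ArrayList;
import java.util.List;

final class LittleEndian {
    // Splits a 32-bit value into four bytes, least significant byte first.
    static byte[] toBytes(int value) {
        return new byte[] {
            (byte) (value & 0xff),
            (byte) ((value >>> 8) & 0xff),
            (byte) ((value >>> 16) & 0xff),
            (byte) ((value >>> 24) & 0xff)
        };
    }
}

// Hypothetical stand-in for the assembler's machineCode list.
final class CodeBuffer {
    private final List<Byte> machineCode = new ArrayList<>();

    // Appends the address at the end of the code.
    void emitAddress(int address) {
        for (byte b : LittleEndian.toBytes(address)) {
            machineCode.add(b);
        }
    }

    // Overwrites four already emitted bytes, e.g. when resolving a label.
    void setAddress(int index, int address) {
        byte[] bytes = LittleEndian.toBytes(address);
        for (int i = 0; i < bytes.length; i++) {
            machineCode.set(index + i, bytes[i]);
        }
    }

    byte get(int index) {
        return machineCode.get(index);
    }

    int size() {
        return machineCode.size();
    }
}
```

The standard library can do the same conversion: `ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(value).array()` from `java.nio` yields the same four bytes.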

Be careful of what you accept

Right now, the assembler will happily accept lines like these:

jmp:    jmp jmp
reg3:   ja reg3

It's possible that this is intentional, but I'm not convinced it's good design. In any case, this is almost certainly not intended:

0004:   jb 0004 

It creates a label 0004 which cannot be used, and then assembles a jb 0004 instruction.
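One possible guard is to validate label names before recording them. The rule below is an assumption, not something the original assembler specifies: a label must look like an identifier (a letter or underscore followed by letters, digits or underscores) and must not collide with a mnemonic or register name. `LabelValidator` is a hypothetical name.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class LabelValidator {
    // Mnemonics and register names taken from the question's code.
    private static final Set<String> RESERVED = new HashSet<>(Arrays.asList(
            "add", "neg", "mul", "div", "mod", "cmp", "ja", "je", "jb", "jmp",
            "call", "ret", "load", "store", "const", "halt", "int", "nop",
            "push", "pusha", "pop", "popa", "lsp", "word", "str",
            "reg1", "reg2", "reg3", "reg4"));

    // Assumed rule: identifier-shaped and not a reserved word.
    static boolean isValidLabel(String label) {
        return label.matches("[A-Za-z_][A-Za-z0-9_]*") && !RESERVED.contains(label);
    }
}
```

With a check like this in handleLabel, the three lines above would be rejected with an error instead of being silently assembled.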

Consider using real compiler tools

You might want to look into using flex and bison (or JFlex and BYACC if you want to continue using Java for this). They take a little time to learn, but are well worth it. Your entire program, for instance, could be done quite simply in C using flex and bison.

 
 
2 votes

There's a lot of code that is mostly copy-and-paste. For each instruction, there's an opcode constant (e.g. public static final byte ADD = 0x01;), a map entry (e.g. mapOpcodeToAssembler.put("add", this::assembleAdd);) and a handler (the assembleAdd() method).

There is therefore an enormous opportunity to generalize. Each instruction consists of an opcode, and may have zero to two arguments, which may be immediate values, registers, or addresses. You could use an Enum, or you could have all the instructions defined in a configuration file.
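To make the enum idea concrete, here is a sketch of a single generic handler that could replace the separate handlers for instructions whose operands are all registers. It is an illustration under assumptions: `RegisterOp` and `GenericRegisterAssembler` are hypothetical names, only the register-only instructions are covered, and `IllegalArgumentException` stands in for the question's `AssemblyException`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical enum: mnemonic, opcode and number of register operands.
// Opcode values are copied from the constants in the question.
enum RegisterOp {
    ADD("add", 0x01, 2), NEG("neg", 0x02, 1), MUL("mul", 0x03, 2),
    DIV("div", 0x04, 2), MOD("mod", 0x05, 2), CMP("cmp", 0x10, 2),
    PUSH("push", 0x50, 1), POP("pop", 0x52, 1), LSP("lsp", 0x54, 1);

    final String mnemonic;
    final byte opcode;
    final int registerCount;

    RegisterOp(String mnemonic, int opcode, int registerCount) {
        this.mnemonic = mnemonic;
        this.opcode = (byte) opcode;
        this.registerCount = registerCount;
    }

    // Returns null if the mnemonic is not a register-only instruction.
    static RegisterOp fromMnemonic(String mnemonic) {
        for (RegisterOp op : values()) {
            if (op.mnemonic.equals(mnemonic)) {
                return op;
            }
        }
        return null;
    }
}

final class GenericRegisterAssembler {
    // The list index doubles as the register's encoding (reg1 -> 0x00, ...).
    private static final List<String> REGISTERS =
            Arrays.asList("reg1", "reg2", "reg3", "reg4");

    // One handler for every instruction in RegisterOp.
    static List<Byte> assemble(String line) {
        String[] tokens = line.trim().split("\\s+");
        RegisterOp op = RegisterOp.fromMnemonic(tokens[0]);
        if (op == null) {
            throw new IllegalArgumentException(
                    "Unknown instruction \"" + tokens[0] + "\".");
        }
        if (tokens.length != op.registerCount + 1) {
            throw new IllegalArgumentException(
                    "The '" + op.mnemonic + "' instruction takes exactly "
                    + op.registerCount + " register operand(s).");
        }
        List<Byte> code = new ArrayList<>();
        code.add(op.opcode);
        for (int i = 1; i < tokens.length; i++) {
            int index = REGISTERS.indexOf(tokens[i]);
            if (index < 0) {
                throw new IllegalArgumentException(
                        "Unknown register token: \"" + tokens[i] + "\".");
            }
            code.add((byte) index);
        }
        return code;
    }
}
```

Nine of the original handlers collapse into one method here; jump, load/store and data instructions would need additional operand kinds handled in the same loop.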

 
 
